Abstract: We consider optimizing average queueing delay
and average power consumption in a nonpreemptive multi-class
M/G/1 queue with dynamic power control that affects
instantaneous service rates. Four problems are studied: (1) satisfying
per-class average delay constraints; (2) minimizing a separable
convex function of average delays subject to per-class delay
constraints; (3) minimizing average power consumption subject
to per-class delay constraints; (4) minimizing a separable convex
function of average delays subject to an average power constraint.
Combining an achievable region approach in queueing systems
and the Lyapunov optimization theory suitable for optimizing
dynamic systems with time average constraints, we propose a
unified framework to solve the above problems. The solutions are
variants of dynamic $c\mu$ rules, and implement weighted priority
policies in every busy period, where weights are determined by
past queueing delays in all job classes. Our solutions require
limited statistical knowledge of arrivals and service times, and
no statistical knowledge is needed in the first problem. Overall, we
provide a new set of tools for stochastic optimization and control
over multi-class queueing systems with time average constraints.



I. Introduction 

Stochastic scheduling over multi-class queueing systems 
has important applications such as CPU scheduling, request 
processing in web servers, and QoS provisioning to different 
types of traffic in a telecommunication network. In these sys- 
tems, power management is increasingly important due to their 
massive energy consumption. To study this problem, in this 
paper we consider a single-server multi-class queueing system 
whose instantaneous service rate is controllable by dynamic 
power allocations. This is modeled as a nonpreemptive multi-class
M/G/1 queue with $N$ job classes $\{1, \ldots, N\}$, and the
goal is to optimize average queueing delays of all job classes 
and average power consumption in this queueing network. We 
consider four delay and power control problems: 

1) Designing a policy that yields average queueing delay
$\overline{W}_n$ of class $n$ satisfying $\overline{W}_n \le d_n$
for all classes, where $\{d_1, \ldots, d_N\}$ are given feasible
delay bounds. Here we assume a fixed power allocation and no
power control.

2) Minimizing a separable convex function
$\sum_{n=1}^{N} f_n(\overline{W}_n)$ of average queueing delays
$(\overline{W}_n)_{n=1}^{N}$ subject to delay constraints
$\overline{W}_n \le d_n$ for all classes $n$, assuming a fixed
power allocation and no power control.

3) Under dynamic power allocation, minimizing average
power consumption subject to delay constraints
$\overline{W}_n \le d_n$ for all classes $n$.

4) Under dynamic power allocation, minimizing a separable
convex function $\sum_{n=1}^{N} f_n(\overline{W}_n)$ of average
queueing delays $(\overline{W}_n)_{n=1}^{N}$ subject to an average
power constraint.

These problems are presented in order of increasing complexity
so that readers can gradually familiarize themselves with the
methodology we use to attack them.

Each of the above problems is highly nontrivial, thus novel 
yet simple approaches are needed. This paper provides such a 
framework by connecting two powerful stochastic optimization 
theories: The achievable region approach in queueing systems, 
and the Lyapunov optimization theory in wireless networks. In 
queueing systems, the achievable region approach that treats 
optimal control problems as mathematical programming ones 
has been fruitful; see [1]-[4] for a detailed survey. In a
nonpreemptive multi-class M/G/1 queue, it is known that the
collection of all feasible average queueing delay vectors forms a
special polytope (a base of a polymatroid) with vertices being
the performance vectors of strict priority policies ([5]; see
Section III for more details). As a result, every feasible average
queueing delay vector is attainable by a randomization of strict
priority policies. Such randomization can be implemented in
a frame-based style, where a priority ordering is randomly
deployed in every busy period using a probability distribution
that is used in all busy periods (see Lemma 1 in Section III).
This view of the delay performance region is useful in the first 
two delay control problems. 

In addition to queueing delay, when dynamic power control 
is part of the decision space, it is natural to consider dynamic 
policies that allocate a fixed power in every busy period. 
The resulting joint power and delay performance region is 
then spanned by frame-based randomizations of power control 
and strict priority policies. We treat the last two delay and 
power control problems as stochastic optimization over such
a performance region (see Section VI-A for an example).



With the above characterization of performance regions, we 
solve the four control problems using Lyapunov optimization 
theory. This theory was originally developed for stochastic
optimal control over time-slotted wireless networks (5), JTj, 
later extended by (SJ, (5J that allow optimizing various per- 
formance objectives such as average power [ 1 1 or throughput 
utility fTT) , and recently generalized to optimize dynamic 
systems that have a renewal structure [JT3J — (T3J . The Lyapunov 
optimization theory transforms time average constraints into 
virtual queues that need to be stabilized. Using a Lyapunov 
drift argument, we construct frame-based policies to solve 
the four control problems. The resulting policy is a sequence
of base policies implemented frame by frame, where the 
collection of all base policies span the performance region 
through time sharing or randomization. The base policy used 
in each frame is chosen by minimizing a ratio of an expected 
"drift plus penalty" sum over the expected frame size, where 
the ratio is a function of past queueing delays in all job classes. 
In this paper the base policies are nonpreemptive strict priority 
policies with deterministic power allocations. 

Our methodology is as follows. By characterizing the per- 
formance region using the collection of all randomizations of 
base policies, for each control problem, there exists an optimal 
random mixture of base policies that solves the problem. 
Although the probability distribution that defines the optimal 
random mixture is unknown, we construct a dynamic policy 
using Lyapunov optimization theory. This policy makes greedy 
decisions in every frame, stabilizes all virtual queues (thus 
satisfying all time average constraints), and yields near-optimal 
performance. The existence of the optimal randomized policy 
is essential to prove these results. 

In our policies for the four control problems, requests of
different classes are prioritized by a dynamic $c\mu$ rule [1]
which, in every busy period, assigns priorities in the decreasing
order of weights associated with each class. The weights of
all classes are updated at the end of every busy period by
simple queue-like rules (so that different priorities may be
assigned in different busy periods), which capture the running
difference between the current and the desired performance.
The dynamic $c\mu$ rule in the first problem does not require any
statistical knowledge of arrivals and service times. The policy
for the second problem requires only the mean but not higher
moments of arrivals and service times. In the last two problems
with dynamic power control, besides the dynamic $c\mu$ rules, a
power level is allocated in every busy period by optimizing a
weighted sum of power and power-dependent average delays.
The policies for the third and the last problem require the
mean and the first two moments of arrivals and service times,
respectively, because of dynamic power allocations.

In each of the last three problems, our policies yield
performance that is at most $O(1/V)$ away from the optimal, where
$V > 0$ is a control parameter that can be chosen sufficiently
large to yield near-optimal performance. The tradeoff of choosing
large $V$ values is the amount of time required to meet
the time average constraints. In this paper we also propose
a proportional delay fairness criterion, in the same spirit
as the well-known rate proportional fairness [16] or utility
proportional fairness [17], and show that the corresponding
delay objective functions are quadratic. Overall, since our
policies use dynamic $c\mu$ rules with weights of simple updates,
and require limited statistical knowledge, they scale gracefully
with the number of job classes and are suitable for online
implementation.

In the literature, work [18] characterizes multi-class G/M/c
queues that have polymatroidal performance regions, and
provides two numerical methods to minimize a separable
convex function of average delays as an unconstrained static
optimization problem. However, [18] leaves unclear how to control
the queueing system to achieve the optimal performance.



Minimizing a convex holding cost in a single-server multi-class
queue is formulated as a restless bandit problem in [19],
[20], and Whittle's index policies [21] are constructed as a
heuristic solution. Work [22] proposes a generalized $c\mu$ rule
to minimize a convex holding cost over a finite horizon in
a multi-class queue, and shows it is asymptotically optimal
under heavy traffic. This paper provides a dynamic control
algorithm for the minimization of convex functions of average
delays. In particular, we consider additional time average power
and delay constraints, and our solutions require limited
statistical knowledge and have provable near-optimal performance.

This paper also applies to power-aware scheduling problems 
in computer systems. These problems are widely studied in 
different contexts, where two main analytical tools are
competitive analysis [23]-[27] and M/G/1-type queueing theory
(see [28] and references therein), both used to optimize metrics 
such as a weighted sum of average power and delay. This paper 
presents a fundamentally different approach for more directed 
control over average power and delays, and considers a multi- 
class setup with time average constraints. 

In the rest of the paper, the detailed queueing model is given
in Section II, followed by a summary of useful M/G/1
properties in Section III. The four delay-power control problems
are solved in Sections IV-VII, followed by simulation results.



II. Queueing Model 

We only consider queueing delay, not system delay (queue- 
ing plus service) in this paper. System delay can be eas- 
ily incorporated since, in a nonpreemptive system, average 
queueing and system delay differ only by a constant (the 
average service time). We will use "delay" and "queueing 
delay" interchangeably in the rest of the paper. 

Consider a single-server queueing system processing jobs
categorized into $N$ classes. In each class $n \in \{1, 2, \ldots, N\}$,
jobs arrive as a Poisson process with rate $\lambda_n$. Each class $n$
job has size $S_n$. We assume $S_n$ is i.i.d. in each class, independent
across classes, and that the first four moments of $S_n$ are finite
for all classes $n$. The system processes arrivals nonpreemptively
with instantaneous service rate $\mu(P(t))$, where $\mu(\cdot)$
is a concave, continuous, and nondecreasing function of the
allocated power $P(t)$ (the concavity of the rate-power relationship
is observed in computer systems [28]-[30]). Within each class,
arrivals are served in a first-in-first-out fashion. We consider
a frame-based system, where each frame consists of an idle
period and the following busy period. Let $t_k$ be the start of
the $k$th frame for each $k \in \mathbb{Z}^+$; the $k$th frame is
$[t_k, t_{k+1})$. Define $t_0 = 0$ and assume the system is initially
empty. Define $T_k \triangleq t_{k+1} - t_k$ as the size of frame $k$.
Let $A_{n,k}$ denote the set of class $n$ arrivals in frame $k$. For
each job $i \in A_{n,k}$, let $W_{n,k}^{(i)}$ denote its queueing delay.

The control over this queueing system is power allocations
and job scheduling across all classes. We restrict to the
following frame-based policies that are both causal and
work-conserving:$^1$

    In every frame $k \in \mathbb{Z}^+$, use a fixed power level
    $P_k \in [P_{\min}, P_{\max}]$ and a nonpreemptive strict priority
    policy $\pi(k)$ for the duration of the busy period in
    that frame. The decisions are possibly random.

$^1$ Causality means that every control decision depends only on the current
and past states of the system; work-conserving means that the server is never
idle when there is still work to do.

In these policies, $P_{\max}$ denotes the maximum power allocation.
We assume $P_{\max}$ is finite, but sufficiently large to ensure
feasibility of the desired delay constraints. The minimum
power $P_{\min}$ is chosen to be large enough so that the queue
is stable even if power $P_{\min}$ is used for all time. In particular,
for stability we need

$$\sum_{n=1}^{N} \lambda_n \frac{\mathbb{E}[S_n]}{\mu(P_{\min})} < 1 \iff \mu(P_{\min}) > \sum_{n=1}^{N} \lambda_n \mathbb{E}[S_n].$$

The strict priority rule $\pi(k) = (\pi_n(k))_{n=1}^{N}$ is represented by
a permutation of $\{1, \ldots, N\}$, where class $\pi_n(k)$ gets the $n$th
highest priority.

The motivation for focusing on the above frame-based policies
is to simplify the control of the queueing system to
achieve complex performance objectives. Simulations in [31],
however, suggest that this method may incur higher variance
in performance than policies that take control actions based
on job occupancies in the queue. Yet, the problems considered
in this paper seem difficult to attack with job-level scheduling,
which may involve solving high-dimensional (partially observable)
Markov decision processes with time average power and delay
constraints and convex holding costs.

A. Definition of Average Delay 

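The average delay under policies we propose later may not
have well-defined limits. Thus, inspired by [13], we define

$$\overline{W}_n \triangleq \limsup_{K \to \infty} \frac{\mathbb{E}\left[\sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right]}{\mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right]} \tag{1}$$

as the average delay of class $n \in \{1, \ldots, N\}$, where $|A_{n,k}|$
is the number of class $n$ arrivals during frame $k$. We only
consider delay sampled at frame boundaries for simplicity. To
verify (1), note that the running average delay of class $n$ jobs
up to time $t_K$ is equal to

$$\frac{\sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}}{\sum_{k=0}^{K-1} |A_{n,k}|} = \frac{\frac{1}{K} \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}}{\frac{1}{K} \sum_{k=0}^{K-1} |A_{n,k}|}.$$

Let $w_n^{\mathrm{av}}$ and $a_n^{\mathrm{av}}$ denote the limits, as
$K \to \infty$, of the numerator and the denominator on the right side,
respectively. If both limits $w_n^{\mathrm{av}}$ and $a_n^{\mathrm{av}}$
exist, the ratio $w_n^{\mathrm{av}} / a_n^{\mathrm{av}}$ is the limiting
average delay for class $n$. In this case, we get

$$\overline{W}_n = \lim_{K \to \infty} \frac{\mathbb{E}\left[\frac{1}{K} \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right]}{\mathbb{E}\left[\frac{1}{K} \sum_{k=0}^{K-1} |A_{n,k}|\right]} = \frac{\mathbb{E}\left[\lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right]}{\mathbb{E}\left[\lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} |A_{n,k}|\right]} = \frac{w_n^{\mathrm{av}}}{a_n^{\mathrm{av}}}, \tag{2}$$

which shows $\overline{W}_n$ is indeed the limiting average delay.$^2$ The
definition in (1) replaces lim by lim sup to guarantee it is
well-defined.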
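
As a small illustration of the ratio-of-sums form used in the definition of average delay above, the following sketch computes the running average delay of one class at frame boundaries along a single sample path (no expectation is taken). The function name, the data layout, and the toy numbers are hypothetical and are not part of the paper.

```python
from typing import List


def running_average_delay(frames: List[List[float]]) -> List[float]:
    """Running average delay of one class, sampled at frame boundaries.

    frames[k] holds the queueing delays of the jobs of this class that
    arrived in frame k (an assumed data layout). Entry K-1 of the result is
    (sum of delays in frames 0..K-1) / (number of jobs in frames 0..K-1).
    """
    averages = []
    total_delay = 0.0
    total_jobs = 0
    for delays in frames:
        total_delay += sum(delays)
        total_jobs += len(delays)
        # Before the first arrival of this class there is nothing to average.
        averages.append(total_delay / total_jobs if total_jobs else 0.0)
    return averages


# Toy sample path with three frames; the second frame has no arrivals.
toy_frames = [[0.0, 2.0], [], [1.0, 3.0, 0.0]]
print(running_average_delay(toy_frames))
```

Note that the running average is a ratio of cumulative sums, not an average of per-frame averages; frames with many arrivals weigh more.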

III. Preliminaries 

This section summarizes useful properties of a nonpreemptive
multi-class M/G/1 queue. Here we assume a fixed power
allocation $P$ and a fixed service rate $\mu(P)$ (this is extended
in a later section). Let $X_n = S_n / \mu(P)$ be the service time of a
class $n$ job. Define $\rho_n = \lambda_n \mathbb{E}[X_n]$. Fix an arrival
rate vector $(\lambda_n)_{n=1}^{N}$ satisfying $\sum_{n=1}^{N} \rho_n < 1$;
the rate vector $(\lambda_n)_{n=1}^{N}$ is supportable in the queueing
network.

For each $k \in \mathbb{Z}^+$, let $I_k$ and $B_k$ denote the $k$th idle and
busy period, respectively; the frame size is $T_k = I_k + B_k$. The
distribution of $B_k$ (and $T_k$) is fixed under any work-conserving
policy, since the sample path of unfinished work in the system
is independent of scheduling policies. Due to the memoryless
property of Poisson arrivals, we have
$\mathbb{E}[I_k] = 1 / (\sum_{n=1}^{N} \lambda_n)$
for all $k$. For the same reason, the system renews itself at
the start of each frame. Consequently, the frame size $T_k$, busy
period $B_k$, and the per-frame job arrivals $|A_{n,k}|$ of class $n$ are
all i.i.d. over $k$. Using renewal reward theory [32] with renewal
epochs defined at frame boundaries $\{t_k\}_{k=0}^{\infty}$, we have:

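$$\mathbb{E}[T_k] = \frac{\mathbb{E}[I_k]}{1 - \sum_{n=1}^{N} \rho_n} = \frac{1}{\left(1 - \sum_{n=1}^{N} \rho_n\right) \sum_{n=1}^{N} \lambda_n}, \tag{3}$$

$$\mathbb{E}[|A_{n,k}|] = \lambda_n \mathbb{E}[T_k], \quad \forall n \in \{1, \ldots, N\},\ \forall k \in \mathbb{Z}^+. \tag{4}$$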
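
To make the frame statistics concrete, the sketch below evaluates the mean idle period, the mean frame size, and the mean number of per-class arrivals per frame from the arrival rates and mean service times, following the renewal relations stated above. The function name and the two-class numbers are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def frame_statistics(lam: Sequence[float],
                     mean_service: Sequence[float]
                     ) -> Tuple[float, float, List[float]]:
    """Return (E[I_k], E[T_k], [E|A_{n,k}| for each class n]).

    lam[n] is the Poisson arrival rate and mean_service[n] the mean service
    time E[X_n] of class n under the fixed power level.
    """
    rho = sum(l * x for l, x in zip(lam, mean_service))  # total load
    if rho >= 1.0:
        raise ValueError("total load must be below 1 for stability")
    mean_idle = 1.0 / sum(lam)               # memoryless Poisson arrivals
    mean_frame = mean_idle / (1.0 - rho)     # idle fraction of time is 1 - rho
    mean_arrivals = [l * mean_frame for l in lam]
    return mean_idle, mean_frame, mean_arrivals


# Hypothetical two-class example with total load 0.5.
print(frame_statistics([0.2, 0.3], [1.0, 1.0]))
```

In this example the mean idle period is 2, the mean frame size is 4, and a frame contains on average 0.8 class-1 and 1.2 class-2 arrivals.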



It is useful to consider the randomized policy $\pi_{\mathrm{rand}}$ that is
defined by a given probability distribution over all possible $N!$
priority orderings. Specifically, policy $\pi_{\mathrm{rand}}$ randomly selects
priorities at the beginning of every new frame according to this
distribution, and implements the corresponding nonpreemptive
priority rule for the duration of the frame. Again by renewal
reward theory, the average queueing delays $(\overline{W}_n)_{n=1}^{N}$
rendered by a $\pi_{\mathrm{rand}}$ policy satisfy in each frame
$k \in \mathbb{Z}^+$:

$$\mathbb{E}\left[\sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right] = \mathbb{E}\left[\int_{t_k}^{t_{k+1}} Q_n(t)\, dt\right] = \lambda_n \overline{W}_n \mathbb{E}[T_k],$$

where we recall that $W_{n,k}^{(i)}$ represents only the queueing delay
(not including service time), and $Q_n(t)$ denotes the number
of class $n$ jobs waiting in the queue (not including that in the
server) at time $t$.

Next we summarize useful properties of the performance
region of average queueing delay vectors $(\overline{W}_n)_{n=1}^{N}$ in a
nonpreemptive multi-class M/G/1 queue. For these results
we refer readers to [1], [5], [33] for a detailed introduction.
Define the value $x_n = \rho_n \overline{W}_n$ for each class
$n \in \{1, \ldots, N\}$, and denote by $\Omega$ the performance region of
the vector $(x_n)_{n=1}^{N}$. The set $\Omega$ is a special polytope called
(a base of) a polymatroid [34]. An important property of the
polymatroid $\Omega$ is:
(1) Each vertex of $\Omega$ is the performance vector of a strict
nonpreemptive priority rule; (2) Conversely, the performance
vector of each strict nonpreemptive priority rule is a vertex
of $\Omega$. In other words, there is a one-to-one mapping between
vertices of $\Omega$ and the set of strict nonpreemptive priority rules.
As a result, every feasible performance vector
$(x_n)_{n=1}^{N} \in \Omega$, or equivalently every feasible queueing delay
vector $(\overline{W}_n)_{n=1}^{N}$, is attained by a randomization of strict
nonpreemptive priority policies. For completeness, we formalize the
last known result in the next lemma.

$^2$ The second equality in (2), where we pass the limit into the expectation,
can be proved by a generalized Lebesgue's dominated convergence theorem
stated as follows. Let $\{X_n\}_{n=1}^{\infty}$ and $\{Y_n\}_{n=1}^{\infty}$ be
two sequences of random variables such that: (1) $0 \le |X_n| \le Y_n$ with
probability 1 for all $n$; (2) For some random variables $X$ and $Y$,
$X_n \to X$ and $Y_n \to Y$ with probability 1;
(3) $\lim_{n \to \infty} \mathbb{E}[Y_n] = \mathbb{E}[Y] < \infty$. Then
$\mathbb{E}[X]$ is finite and
$\lim_{n \to \infty} \mathbb{E}[X_n] = \mathbb{E}[X]$. The details are omitted
for brevity.

Lemma 1. In a nonpreemptive multi-class M/G/1 queue,
define

$$\mathcal{W} \triangleq \left\{ (\overline{W}_n)_{n=1}^{N} \,\middle|\, (\rho_n \overline{W}_n)_{n=1}^{N} \in \Omega \right\} \tag{5}$$

as the performance region of average queueing delays.
Then:

1) The performance vector $(\overline{W}_n)_{n=1}^{N}$ of each frame-based
randomized policy $\pi_{\mathrm{rand}}$ is in the delay region $\mathcal{W}$.

2) Conversely, every vector $(\overline{W}_n)_{n=1}^{N}$ in the delay region
$\mathcal{W}$ is the performance vector of a $\pi_{\mathrm{rand}}$ policy.

Proof of Lemma 1: Given in Appendix A. ■

Optimizing a linear function over the polymatroidal region
$\Omega$ will be useful. The solution is the following $c\mu$ rule:

Lemma 2 (The $c\mu$ rule [1], [33]). In a nonpreemptive multi-class
M/G/1 queue, define $x_n = \rho_n \overline{W}_n$ and consider the
linear program:

$$\text{minimize:} \quad \sum_{n=1}^{N} c_n x_n \tag{6}$$

$$\text{subject to:} \quad (x_n)_{n=1}^{N} \in \Omega \tag{7}$$

where $c_n$ are nonnegative constants. We assume
$\sum_{n=1}^{N} \rho_n < 1$ for stability, and that second moments
$\mathbb{E}[X_n^2]$ of service times are finite for all classes $n$. The
optimal solution to (6)-(7) is a strict nonpreemptive priority policy
that assigns priorities in the decreasing order of $c_n$. That is, if
$c_1 \ge c_2 \ge \cdots \ge c_N$, then class 1 gets the highest priority,
class 2 gets the second highest priority, and so on. In this case, the
optimal average queueing delay $\overline{W}_n$ of class $n$ is

$$\overline{W}_n = \frac{R}{\left(1 - \sum_{k=0}^{n-1} \rho_k\right)\left(1 - \sum_{k=0}^{n} \rho_k\right)},$$

where $\rho_0 = 0$ and $R = \frac{1}{2} \sum_{n=1}^{N} \lambda_n \mathbb{E}[X_n^2]$.
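
The $c\mu$ rule of Lemma 2 can be checked numerically on a small instance. The sketch below evaluates the classical nonpreemptive priority delay formula for every one of the $N!$ priority orders and compares the best order found by exhaustive search with the order obtained by sorting the weights $c_n$ in decreasing order. Classes are indexed from 0 in the code, and all function names and numerical values are hypothetical choices for illustration.

```python
from itertools import permutations
from typing import Dict, Sequence, Tuple


def priority_delays(order: Sequence[int], lam: Sequence[float],
                    m1: Sequence[float], m2: Sequence[float]) -> Dict[int, float]:
    """Mean queueing delay per class under a nonpreemptive strict priority order.

    order[0] is the highest-priority class; lam are arrival rates, and m1, m2
    are the first and second moments of the service times.
    """
    residual = 0.5 * sum(l * s2 for l, s2 in zip(lam, m2))  # R in Lemma 2
    delays: Dict[int, float] = {}
    load_above = 0.0  # total load of strictly higher-priority classes
    for n in order:
        load_incl = load_above + lam[n] * m1[n]
        delays[n] = residual / ((1.0 - load_above) * (1.0 - load_incl))
        load_above = load_incl
    return delays


def weighted_cost(order, c, lam, m1, m2) -> float:
    """Linear objective: sum over classes of c_n * rho_n * W_n."""
    w = priority_delays(order, lam, m1, m2)
    return sum(c[n] * lam[n] * m1[n] * w[n] for n in order)


def best_order_by_search(c, lam, m1, m2) -> Tuple[int, ...]:
    """Exhaustive search over all N! strict priority orders."""
    return min(permutations(range(len(c))),
               key=lambda order: weighted_cost(order, c, lam, m1, m2))


def c_mu_order(c: Sequence[float]) -> Tuple[int, ...]:
    """Priority order of Lemma 2: classes sorted by decreasing c_n."""
    return tuple(sorted(range(len(c)), key=lambda n: -c[n]))


# Toy instance (all numbers are made up); the total load is 0.55 < 1.
c = [1.0, 3.0, 2.0]
lam = [0.2, 0.1, 0.3]
m1 = [1.0, 2.0, 0.5]
m2 = [2.0, 8.0, 0.5]
print(c_mu_order(c), best_order_by_search(c, lam, m1, m2))
```

Exhaustive search is only practical for a handful of classes; the point of Lemma 2 is that sorting the weights suffices.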

IV. Achieving Delay Constraints 

The first problem we consider is to construct a frame-based
policy that yields average delays satisfying
$\overline{W}_n \le d_n$ for all classes $n \in \{1, \ldots, N\}$, where
$d_n > 0$ are given constants. We assume a fixed power allocation and
that the delay constraints are feasible.

Our solution relies on tracking the running difference
between past queueing delays for each class $n$ and the desired
delay bound $d_n$. For each class $n \in \{1, \ldots, N\}$, we define a
discrete-time virtual delay queue $\{Z_{n,k}\}_{k=0}^{\infty}$ where
$Z_{n,k+1}$ is updated at frame boundary $t_{k+1}$ following the equation

$$Z_{n,k+1} = \max\left[ Z_{n,k} + \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right),\ 0 \right]. \tag{8}$$

Assume $Z_{n,0} = 0$ for all $n$. In (8), the delays $W_{n,k}^{(i)}$ and
constant $d_n$ can be viewed as arrivals and service of the queue
$\{Z_{n,k}\}_{k=0}^{\infty}$, respectively. If this queue is stabilized, we
know that the average arrival rate to the queue (being the per-frame
average sum of class $n$ delays $\sum_{i \in A_{n,k}} W_{n,k}^{(i)}$) is
less than or equal to the average service rate (being the value $d_n$
multiplied by the average number of class $n$ arrivals per frame), from
which we infer $\overline{W}_n \le d_n$. This is formalized below.

Definition 1. We say queue $\{Z_{n,k}\}_{k=0}^{\infty}$ is mean rate stable if
$\lim_{K \to \infty} \mathbb{E}[Z_{n,K}] / K = 0$.

Lemma 3. If queue $\{Z_{n,k}\}_{k=0}^{\infty}$ is mean rate stable, then
$\overline{W}_n \le d_n$.



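Proof of Lemma 3: From (8) we get

$$Z_{n,k+1} \ge Z_{n,k} - d_n |A_{n,k}| + \sum_{i \in A_{n,k}} W_{n,k}^{(i)}.$$

Summing the above over $k \in \{0, \ldots, K-1\}$, using $Z_{n,0} = 0$,
and taking expectation yields

$$\mathbb{E}[Z_{n,K}] \ge -d_n \mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right] + \mathbb{E}\left[\sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right].$$

Dividing the above by $\mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right]$ yields

$$\frac{\mathbb{E}[Z_{n,K}]}{\mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right]} \ge -d_n + \frac{\mathbb{E}\left[\sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)}\right]}{\mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right]}.$$

Taking a lim sup as $K \to \infty$ and using (1) yields

$$\overline{W}_n \le d_n + \limsup_{K \to \infty} \frac{\mathbb{E}[Z_{n,K}]}{K} \cdot \frac{K}{\mathbb{E}\left[\sum_{k=0}^{K-1} |A_{n,k}|\right]}.$$

Using $\mathbb{E}[|A_{n,k}|] = \lambda_n \mathbb{E}[T_k] \ge \lambda_n \mathbb{E}[I_k] = \lambda_n \mathbb{E}[I_0]$, we get

$$\overline{W}_n \le d_n + \frac{1}{\lambda_n \mathbb{E}[I_0]} \lim_{K \to \infty} \frac{\mathbb{E}[Z_{n,K}]}{K} = d_n$$

by mean rate stability of $Z_{n,k}$.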



A. Delay Feasible Policy 

The following policy stabilizes every $\{Z_{n,k}\}_{k=0}^{\infty}$ queue in
the mean rate stable sense and thus achieves
$\overline{W}_n \le d_n$ for all classes $n$.

Delay Feasible (DelayFeas) Policy:

• In every frame $k \in \mathbb{Z}^+$, update $Z_{n,k}$ by (8) and serve
jobs using nonpreemptive strict priorities assigned in the
decreasing order of $Z_{n,k}$; ties are broken arbitrarily.

We note that the DelayFeas policy does not require any statistical
knowledge of job arrivals and service times. Intuitively,
each $Z_{n,k}$ queue tracks the amount of past queueing delays
in class $n$ exceeding the desired delay bound $d_n$ (see (8)),
and the DelayFeas policy gives priorities to classes that more
severely violate their delay constraints.
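
The bookkeeping behind the DelayFeas policy is light, as the following sketch suggests. It implements the virtual delay queue update and the priority selection for a few frames. The per-frame delays are made-up inputs: in a real system they would be measured from the busy period and would themselves depend on the priority order used. The function names, the 0-based class indices, and the tie-breaking by class index are assumptions of this sketch.

```python
from typing import List, Sequence


def update_virtual_queue(z: float, delays: Sequence[float], bound: float) -> float:
    """One virtual delay queue update at a frame boundary.

    Adds (delay - bound) for every job of the class that arrived in the
    frame, then clips the result at zero.
    """
    return max(z + sum(w - bound for w in delays), 0.0)


def delay_feas_order(z: Sequence[float]) -> List[int]:
    """Classes sorted by decreasing virtual backlog.

    sorted() is stable, so ties are broken by class index here; the policy
    allows ties to be broken arbitrarily.
    """
    return sorted(range(len(z)), key=lambda n: -z[n])


# Toy run with two classes and two frames (hypothetical measurements).
bounds = [1.0, 2.0]            # delay bounds d_n
z = [0.0, 0.0]                 # Z_{n,0} = 0
observed = [
    [[3.0, 2.0], [1.0]],       # frame 0: delays of class-0 and class-1 jobs
    [[0.5], [4.0, 5.0]],       # frame 1
]
for k, frame in enumerate(observed):
    print("frame", k, "priority order", delay_feas_order(z), "backlogs", z)
    z = [update_virtual_queue(z[n], frame[n], bounds[n]) for n in range(len(z))]
print("backlogs after two frames", z, "next order", delay_feas_order(z))
```

In this toy run class 0 accumulates backlog in the first frame, while class 1 overtakes it after the second frame and would be served first in the next busy period.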
B. Motivation of the DelayFeas Policy

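The structure of the DelayFeas policy follows a Lyapunov
drift argument. Define vector $Z_k = (Z_{n,k})_{n=1}^{N}$. For some finite
constants $\theta_n > 0$ for all classes $n$, we define the quadratic
Lyapunov function

$$L(Z_k) \triangleq \frac{1}{2} \sum_{n=1}^{N} \theta_n Z_{n,k}^2$$

as a weighted scalar measure of queue sizes $(Z_{n,k})_{n=1}^{N}$. Define
the one-frame Lyapunov drift

$$\Delta(Z_k) \triangleq \mathbb{E}\left[ L(Z_{k+1}) - L(Z_k) \mid Z_k \right]$$

as the conditional expected difference of $L(Z_k)$ over a frame.
Taking square of (8) and using $(\max[a, 0])^2 \le a^2$ yields

$$Z_{n,k+1}^2 \le \left[ Z_{n,k} + \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \right]^2. \tag{9}$$

Multiplying (9) by $\theta_n / 2$, summing over $n \in \{1, \ldots, N\}$, and
taking conditional expectation on $Z_k$, we get

$$\Delta(Z_k) \le \sum_{n=1}^{N} \theta_n Z_{n,k}\, \mathbb{E}\left[ \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \,\middle|\, Z_k \right] + \frac{1}{2} \sum_{n=1}^{N} \theta_n\, \mathbb{E}\left[ \left( \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \right)^2 \,\middle|\, Z_k \right]. \tag{10}$$

Lemma 7 in Appendix B shows that the second term of (10) is
bounded by a finite constant $C > 0$. It leads to the following
Lyapunov drift inequality:

$$\Delta(Z_k) \le C + \sum_{n=1}^{N} \theta_n Z_{n,k}\, \mathbb{E}\left[ \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \,\middle|\, Z_k \right]. \tag{11}$$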

Over all frame-based policies, we are interested in the one
that, in each frame $k$ after observing $Z_k$, minimizes the right
side of (11). Recall that our policy on frame $k$ chooses which
nonpreemptive priorities to use during the frame. To show that
this is exactly the DelayFeas policy, we simplify (11). Under
a frame-based policy, we have by renewal reward theory

$$\mathbb{E}\left[ \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \,\middle|\, Z_k \right] = \lambda_n \overline{W}_{n,k}\, \mathbb{E}[T_k],$$

where $\overline{W}_{n,k}$ denotes the long-term average delay of class $n$
if the control in frame $k$ is repeated in every frame. Together
with $\mathbb{E}[|A_{n,k}|] = \lambda_n \mathbb{E}[T_k]$, inequality (11) is
re-written as

$$\Delta(Z_k) \le \left( C - \mathbb{E}[T_k] \sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n d_n \right) + \mathbb{E}[T_k] \sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \overline{W}_{n,k}. \tag{12}$$

Because in this section we do not have dynamic power
allocation (so that power is fixed to the same value in every
busy period), the value $\mathbb{E}[T_k]$ is the same for all job scheduling
policies. Then our desired policy, in every frame $k$, chooses a
job scheduling to minimize the metric
$\sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \overline{W}_{n,k}$
over all feasible delay vectors $(\overline{W}_{n,k})_{n=1}^{N}$. If we choose
$\theta_n = \mathbb{E}[X_n]$ for all classes $n$,$^3$ the desired policy minimizes
$\sum_{n=1}^{N} Z_{n,k} \lambda_n \mathbb{E}[X_n] \overline{W}_{n,k}$ in every
frame $k$. From Lemma 2, this is achieved by the priority service rule
defined by the DelayFeas policy.



C. Performance of the DelayFeas Policy 

Theorem 1. For every collection of feasible delay bounds
$\{d_1, \ldots, d_N\}$, the DelayFeas policy yields average delays
satisfying $\overline{W}_n \le d_n$ for all classes $n \in \{1, \ldots, N\}$.

Proof of Theorem 1: It suffices to show that the
DelayFeas policy yields mean rate stability for all $Z_{n,k}$ queues
by Lemma 3. By Lemma 1, there exists a randomized priority
policy $\pi^*_{\mathrm{rand}}$ (introduced in Section III) that yields average
delays $\overline{W}_n^*$ satisfying $\overline{W}_n^* \le d_n$ for all classes
$n$. Since the DelayFeas policy minimizes the last term of (12) in each
frame (under $\theta_n = \mathbb{E}[X_n]$ for all $n$), comparing the DelayFeas
policy with the $\pi^*_{\mathrm{rand}}$ policy yields, in every frame $k$,

$$\sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \overline{W}_{n,k}^{\mathrm{DelayFeas}} \le \sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \overline{W}_n^*.$$

It follows that (12) under the DelayFeas policy is further upper
bounded by

$$\Delta(Z_k) \le C + \mathbb{E}[T_k] \sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \left( \overline{W}_{n,k}^{\mathrm{DelayFeas}} - d_n \right) \le C + \mathbb{E}[T_k] \sum_{n=1}^{N} \theta_n Z_{n,k} \lambda_n \left( \overline{W}_n^* - d_n \right) \le C.$$

Taking expectation, summing over $k \in \{0, \ldots, K-1\}$, and
noting $L(Z_0) = 0$, we get

$$\mathbb{E}[L(Z_K)] = \frac{1}{2} \sum_{n=1}^{N} \theta_n \mathbb{E}\left[ Z_{n,K}^2 \right] \le KC.$$

It follows that $\mathbb{E}[Z_{n,K}^2] \le 2KC / \theta_n$ for all classes $n$.
Since $Z_{n,K} \ge 0$, we get

$$0 \le \mathbb{E}[Z_{n,K}] \le \sqrt{\mathbb{E}\left[ Z_{n,K}^2 \right]} \le \sqrt{2KC / \theta_n}.$$

Dividing the above by $K$ and passing $K \to \infty$ yields

$$\lim_{K \to \infty} \frac{\mathbb{E}[Z_{n,K}]}{K} = 0, \quad \forall n \in \{1, \ldots, N\},$$

and all $Z_{n,k}$ queues are mean rate stable.

$^3$ We note that the mean service time $\mathbb{E}[X_n]$ as a value of
$\theta_n$ is only needed in the arguments constructing the DelayFeas policy.
The DelayFeas policy itself does not need the knowledge of $\mathbb{E}[X_n]$.
V. Minimizing Delay Penalty Functions

Generalizing the first delay feasibility problem, next we
optimize a separable penalty function of average delays. For
each class $n$, let $f_n(\cdot)$ be a nondecreasing, nonnegative,
continuous, and convex function of average delay $\overline{W}_n$. Consider
the constrained penalty minimization problem

$$\text{minimize:} \quad \sum_{n=1}^{N} f_n(\overline{W}_n) \tag{13}$$

$$\text{subject to:} \quad \overline{W}_n \le d_n, \quad \forall n \in \{1, \ldots, N\}. \tag{14}$$

We assume that a constant power is allocated in all frames,
and that constraints (14) are feasible. The goal is to construct
a frame-based policy that solves (13)-(14). Let
$(\overline{W}_n^*)_{n=1}^{N}$ be the optimal solution to (13)-(14), attained
by a randomized priority policy $\pi^*_{\mathrm{rand}}$ (by Lemma 1).

A. Delay Proportional Fairness 

One interesting penalty function is the one that attains
proportional fairness. We say a delay vector
$(\overline{W}_n^*)_{n=1}^{N}$ is delay proportional fair if it is optimal
under the quadratic penalty function
$f_n(\overline{W}_n) = \frac{1}{2} c_n \overline{W}_n^2$ for each class $n$,
where $c_n > 0$ are given constants. The intuition is two-fold. First,
under the quadratic penalty functions, any feasible delay vector
$(\overline{W}_n)_{n=1}^{N}$ necessarily satisfies

$$\sum_{n=1}^{N} f_n'(\overline{W}_n^*) \left( \overline{W}_n - \overline{W}_n^* \right) = \sum_{n=1}^{N} c_n \left( \overline{W}_n - \overline{W}_n^* \right) \overline{W}_n^* \ge 0, \tag{15}$$

which is analogous to the rate proportional fair [16] criterion

$$\sum_{n=1}^{N} c_n \frac{x_n - x_n^*}{x_n^*} \le 0, \tag{16}$$

where $(x_n)_{n=1}^{N}$ is any feasible rate vector and $(x_n^*)_{n=1}^{N}$ is
the optimal rate vector. Second, under rate proportional fairness,
any deviation from the optimal solution yields an aggregate
change of proportional rates that is less than or equal to zero
(see (16)); large rates are penalized for trying to increase. Under
delay proportional fairness, any deviation from the optimal solution
yields an aggregate change of proportional delays that is always
nonnegative (see (15)); small delays are penalized for trying to improve.



B. Delay Fairness Policy 

In addition to having the $\{Z_{n,k}\}_{k=0}^{\infty}$ queues updated by (8)
for all classes $n$, we set up new discrete-time virtual queues
$\{Y_{n,k}\}_{k=0}^{\infty}$ for all classes $n$, where $Y_{n,k+1}$ is updated
at frame boundary $t_{k+1}$ as:

$$Y_{n,k+1} = \max\left[ Y_{n,k} + \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - r_{n,k} \right),\ 0 \right], \tag{17}$$

where $r_{n,k} \in [0, d_n]$ are auxiliary variables chosen at time $t_k$
independent of frame size $T_k$ and the number $|A_{n,k}|$ of class
$n$ arrivals in frame $k$. Assume $Y_{n,0} = 0$ for all $n$. Whereas the
$Z_{n,k}$ queues are useful to enforce delay constraints
$\overline{W}_n \le d_n$ (as seen in Section IV), the $Y_{n,k}$ queues are
useful to achieve the optimal delay vector $(\overline{W}_n^*)_{n=1}^{N}$.

Delay Fairness (DelayFair) Policy:

1) In the $k$th frame for each $k \in \mathbb{Z}^+$, after observing $Z_k$
and $Y_k$, use nonpreemptive strict priorities assigned in
the decreasing order of $(Z_{n,k} + Y_{n,k}) / \mathbb{E}[S_n]$, where
$\mathbb{E}[S_n]$ is the mean size of a class $n$ job. Ties are broken
arbitrarily.

2) At the end of the $k$th frame, compute $Z_{n,k+1}$ and $Y_{n,k+1}$
for all classes $n$ by (8) and (17), respectively, where $r_{n,k}$
is the solution to the convex program:

$$\text{minimize:} \quad V f_n(r_{n,k}) - Y_{n,k} \lambda_n r_{n,k}$$

$$\text{subject to:} \quad 0 \le r_{n,k} \le d_n,$$

where $V > 0$ is a predefined control parameter.



While the DelayFeas policy in Section IV does not require 
any statistical knowledge of arrivals and service times, the 
DelayFair policy needs the mean but not higher moments of 
arrivals and service times for all classes n. 

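In the example of delay proportional fairness with quadratic
penalty functions $f_n(\overline{W}_n) = \frac{1}{2} c_n \overline{W}_n^2$ for
all classes $n$, the second step of the DelayFair policy solves:

$$\text{minimize:} \quad \frac{1}{2} V c_n r_{n,k}^2 - Y_{n,k} \lambda_n r_{n,k}$$

$$\text{subject to:} \quad 0 \le r_{n,k} \le d_n.$$

The solution is $r_{n,k}^* = \min\left[ d_n,\ \frac{Y_{n,k} \lambda_n}{V c_n} \right]$.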
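
The closed-form choice of the auxiliary variable under quadratic penalties is easy to sanity-check numerically. The sketch below compares the clipped stationary point with a brute-force grid search over the interval $[0, d_n]$; the function names, the grid resolution, and the parameter values are hypothetical.

```python
def r_star(y: float, lam: float, c: float, d: float, v: float) -> float:
    """Minimizer of 0.5*v*c*r**2 - y*lam*r over 0 <= r <= d.

    Assumes c > 0, v > 0 and y >= 0, so the unconstrained minimizer
    y*lam/(v*c) is nonnegative and only the upper bound d can bind.
    """
    return min(d, y * lam / (v * c))


def r_by_grid(y: float, lam: float, c: float, d: float, v: float,
              steps: int = 10_000) -> float:
    """Brute-force minimizer on a uniform grid over [0, d] (check only)."""
    grid = [d * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda r: 0.5 * v * c * r * r - y * lam * r)


# Interior solution: small backlog Y relative to V.
print(r_star(4.0, 0.5, 2.0, 3.0, 10.0), r_by_grid(4.0, 0.5, 2.0, 3.0, 10.0))
# Boundary solution: large backlog pushes r to the delay bound d.
print(r_star(200.0, 0.5, 2.0, 3.0, 10.0), r_by_grid(200.0, 0.5, 2.0, 3.0, 10.0))
```

A larger backlog $Y_{n,k}$ raises the auxiliary variable toward the bound $d_n$, while a larger $V$ lowers it, which is how the control parameter trades penalty minimization against queue growth.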



C. Motivation of the DelayFair Policy 

The DelayFair policy follows a Lyapunov drift argument
similar to that in Section IV. Define $Z_k = (Z_{n,k})_{n=1}^{N}$ and
$Y_k = (Y_{n,k})_{n=1}^{N}$. Define the Lyapunov function
$L(Z_k, Y_k) \triangleq \frac{1}{2} \sum_{n=1}^{N} \left( Z_{n,k}^2 + Y_{n,k}^2 \right)$
and the one-frame Lyapunov drift

$$\Delta(Z_k, Y_k) \triangleq \mathbb{E}\left[ L(Z_{k+1}, Y_{k+1}) - L(Z_k, Y_k) \mid Z_k, Y_k \right].$$

Taking square of (17) yields

$$Y_{n,k+1}^2 \le \left[ Y_{n,k} + \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - r_{n,k} \right) \right]^2. \tag{18}$$

Summing (9) and (18) over all classes $n \in \{1, \ldots, N\}$,
dividing the result by 2, and taking conditional expectation
on $Z_k$ and $Y_k$, we get

$$\Delta(Z_k, Y_k) \le C - \sum_{n=1}^{N} Z_{n,k} d_n\, \mathbb{E}\left[ |A_{n,k}| \mid Z_k, Y_k \right] - \sum_{n=1}^{N} Y_{n,k}\, \mathbb{E}\left[ r_{n,k} |A_{n,k}| \mid Z_k, Y_k \right] + \sum_{n=1}^{N} \left( Z_{n,k} + Y_{n,k} \right) \mathbb{E}\left[ \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \,\middle|\, Z_k, Y_k \right], \tag{19}$$

where $C > 0$ is a finite constant, different from that used
in Section IV-B, upper bounding the sum of all
$(Z_k, Y_k)$-independent terms. This constant exists using arguments
similar to those in Lemma 7 of Appendix B.
Adding to both sides of (19) the weighted penalty term
$V \sum_{n=1}^{N} \mathbb{E}\left[ f_n(r_{n,k}) T_k \mid Z_k, Y_k \right]$, where
$V > 0$ is a predefined control parameter, and evaluating the result under a
frame-based policy (similar to the analysis in Section IV-C),
we get the following Lyapunov drift plus penalty inequality:

$$\Delta(Z_k, Y_k) + V \sum_{n=1}^{N} \mathbb{E}\left[ f_n(r_{n,k}) T_k \mid Z_k, Y_k \right] \le \left( C - \mathbb{E}[T_k] \sum_{n=1}^{N} Z_{n,k} \lambda_n d_n \right) + \mathbb{E}[T_k] \sum_{n=1}^{N} \mathbb{E}\left[ V f_n(r_{n,k}) - Y_{n,k} \lambda_n r_{n,k} \mid Z_k, Y_k \right] + \mathbb{E}[T_k] \sum_{n=1}^{N} \left( Z_{n,k} + Y_{n,k} \right) \lambda_n \overline{W}_{n,k}. \tag{20}$$

We are interested in minimizing the right side of (20) in every
frame $k$ over all frame-based policies and (possibly random)
choices of $r_{n,k}$. Recall that in this section a constant power is
allocated in all frames so that the value $\mathbb{E}[T_k]$ is fixed under
any work-conserving policy. The first and second steps of the
DelayFair policy minimize the last term (by Lemma 2) and the
second-to-last term of (20), respectively.



D. Performance of the DelayFair Policy 

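Theorem 2. Given any feasible delay bounds $\{d_1, \ldots, d_N\}$, the DelayFair policy yields average delays satisfying $\overline{W}_n \le d_n$ for all classes $n \in \{1, \ldots, N\}$, and attains average delay penalty satisfying

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} |A_{n,k}| \right]} \right) \le \frac{C \sum_{n=1}^N \lambda_n}{V} + \sum_{n=1}^N f_n(\overline{W}_n^*),$$

where $V > 0$ is a predefined control parameter and $C > 0$ a finite constant. By choosing $V$ sufficiently large, we attain a delay penalty arbitrarily close to the optimal value $\sum_{n=1}^N f_n(\overline{W}_n^*)$.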

We remark that the tradeoff of choosing a large $V$ value
is the amount of time required for virtual queues $\{Z_{n,k}\}_{k=0}^{\infty}$
and $\{Y_{n,k}\}_{k=0}^{\infty}$ to approach mean rate stability (see (23) in
the next proof), that is, the time required for the virtual queue
backlogs to be negligible with respect to the time horizon.

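Proof of Theorem 2: Consider the optimal randomized policy $\pi^*$ that yields optimal delays $\overline{W}_n^* \le d_n$ for all classes $n$. Since the DelayFair policy minimizes the right side of (20), comparing the DelayFair policy with the policy $\pi^*$ and with the genie decision $r_{n,k}^* = \overline{W}_n^*$ for all classes $n$ and frames $k$, inequality (20) under the DelayFair policy is further upper bounded by

$$\Delta(Z_k, Y_k) + V \sum_{n=1}^N \mathbb{E}\left[ f_n(r_{n,k}) T_k \mid Z_k, Y_k \right] \le C - \mathbb{E}[T_k] \sum_{n=1}^N Z_{n,k} \lambda_n d_n + \mathbb{E}[T_k] \sum_{n=1}^N (Z_{n,k} + Y_{n,k}) \lambda_n \overline{W}_n^* + \mathbb{E}[T_k] \sum_{n=1}^N \left( V f_n(\overline{W}_n^*) - Y_{n,k} \lambda_n \overline{W}_n^* \right) \le C + V\, \mathbb{E}[T_k] \sum_{n=1}^N f_n(\overline{W}_n^*). \tag{21}$$

Removing the second term of (21) yields

$$\Delta(Z_k, Y_k) \le C + V\, \mathbb{E}[T_k] \sum_{n=1}^N f_n(\overline{W}_n^*) \le C + VD, \tag{22}$$

where $D \triangleq \mathbb{E}[T_k] \sum_{n=1}^N f_n(\overline{W}_n^*)$ is a finite constant. Taking expectation of (22), summing over $k \in \{0, \ldots, K-1\}$, and noting $L(Z_0, Y_0) = 0$ yields $\mathbb{E}[L(Z_K, Y_K)] \le K(C + VD)$. It follows that, for each class $n$ queue $\{Z_{n,k}\}_{k=0}^{\infty}$, we have

$$0 \le \frac{\mathbb{E}[Z_{n,K}]}{K} \le \sqrt{\frac{\mathbb{E}[Z_{n,K}^2]}{K^2}} \le \sqrt{\frac{2\, \mathbb{E}[L(Z_K, Y_K)]}{K^2}} \le \sqrt{\frac{2C}{K} + \frac{2VD}{K}}. \tag{23}$$

Passing $K \to \infty$ proves that queue $\{Z_{n,k}\}_{k=0}^{\infty}$ is mean rate stable for all classes $n$. Thus constraints $\overline{W}_n \le d_n$ are satisfied by Lemma 3. Similarly, the $\{Y_{n,k}\}_{k=0}^{\infty}$ queues are mean rate stable for all classes $n$.

Next, taking expectation of (21), summing over $k \in \{0, \ldots, K-1\}$, dividing by $V$, and noting $L(Z_0, Y_0) = 0$ yields

$$\frac{\mathbb{E}[L(Z_K, Y_K)]}{V} + \sum_{n=1}^N \mathbb{E}\left[ \sum_{k=0}^{K-1} f_n(r_{n,k}) T_k \right] \le \frac{KC}{V} + \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right] \sum_{n=1}^N f_n(\overline{W}_n^*).$$

Removing the first term and dividing by $\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]$ yields

$$\sum_{n=1}^N \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} f_n(r_{n,k}) T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \le \frac{KC}{V\, \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} + \sum_{n=1}^N f_n(\overline{W}_n^*) \overset{(a)}{\le} \frac{C \sum_{n=1}^N \lambda_n}{V} + \sum_{n=1}^N f_n(\overline{W}_n^*), \tag{24}$$

where (a) follows from $\mathbb{E}[T_k] \ge \mathbb{E}[I_k] = 1 / (\sum_{n=1}^N \lambda_n)$. By [14, Lemma 7.6] and convexity of $f_n(\cdot)$, we get

$$\sum_{n=1}^N \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} f_n(r_{n,k}) T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \ge \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \right). \tag{25}$$

Combining (24) and (25) and taking a lim sup as $K \to \infty$ yields

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \right) \le \frac{C \sum_{n=1}^N \lambda_n}{V} + \sum_{n=1}^N f_n(\overline{W}_n^*).$$
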

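The next lemma, proved in Appendix C, completes the proof. ■

Lemma 4. If queues $\{Y_{n,k}\}_{k=0}^{\infty}$ are mean rate stable for all classes $n$, then

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} |A_{n,k}| \right]} \right) \le \limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \right).$$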



VI. Delay-Constrained Optimal Power Control 

In this section we incorporate dynamic power control into the queueing system. As mentioned in Section II, we focus on frame-based policies that allocate a constant power $P_k \in [P_{\min}, P_{\max}]$ over the duration of the $k$th busy period (we assume zero power is allocated when the system is idle). Here, interesting quantities such as frame size $T_k$, busy period $B_k$, the set $A_{n,k}$ of per-frame class $n$ arrivals, and queueing delay are all functions of power $P_k$. Similar to the earlier definition of average delay, we define the average power consumption

$$\overline{P} \triangleq \limsup_{K \to \infty} \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} P_k B_k(P_k) \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]}, \tag{26}$$

where $B_k(P_k)$ and $T_k(P_k)$ emphasize the power dependence of $B_k$ and $T_k$. It is easy to show that both $B_k(P_k)$ and $T_k(P_k)$ are decreasing in $P_k$. The goal is to solve the delay-constrained power minimization problem:

$$\text{minimize:} \quad \overline{P} \tag{27}$$

$$\text{subject to:} \quad \overline{W}_n \le d_n, \quad \forall n \in \{1, \ldots, N\} \tag{28}$$

over frame-based power control and nonpreemptive priority policies.

A. Power-Delay Performance Region 

Every frame-based power control and nonpreemptive priority policy can be viewed as a time sharing or randomization of stationary policies that make the same deterministic decision in every frame. Using this point of view, next we give an example of the joint power-delay performance region resulting from frame-based policies. Consider a two-class nonpreemptive M/G/1 queue with parameters:

• $\lambda_1 = 1$, $\lambda_2 = 2$, $\mathbb{E}[S_1] = \mathbb{E}[S_2] = \mathbb{E}[S_2^2] = 1$, $\mathbb{E}[S_1^2] = 2$.
• $\mu(P) = P$. For each class $n \in \{1, 2\}$, the service time $X_n$ has mean $\mathbb{E}[X_n] = \mathbb{E}[S_n] / P$ and second moment $\mathbb{E}[X_n^2] = \mathbb{E}[S_n^2] / P^2$.
• For stability, we must have $\lambda_1 \mathbb{E}[X_1] + \lambda_2 \mathbb{E}[X_2] < 1 \Rightarrow P > 3$. In this example, let $[4, 10]$ be the feasible power region.

Under a constant power allocation $P$, let $\mathcal{W}(P)$ denote the set of achievable queueing delay vectors $(\overline{W}_1, \overline{W}_2)$. Define $\rho_n \triangleq \lambda_n \mathbb{E}[X_n]$ and $R \triangleq \frac{1}{2} \sum_{n=1}^2 \lambda_n \mathbb{E}[X_n^2]$. Then we have

$$\mathcal{W}(P) = \left\{ (\overline{W}_1, \overline{W}_2) \;\middle|\; \overline{W}_n \ge \frac{R}{1 - \rho_n},\ n \in \{1, 2\};\quad \sum_{n=1}^2 \rho_n \overline{W}_n = \frac{(\rho_1 + \rho_2) R}{1 - \rho_1 - \rho_2} \right\}. \tag{29}$$

The inequalities in $\mathcal{W}(P)$ show that the minimum delay for each class is attained when it has priority over the other. The equality in $\mathcal{W}(P)$ follows from the M/G/1 conservation law [35]. Using the above parameters, we get

$$\mathcal{W}(P) = \left\{ (\overline{W}_1, \overline{W}_2) \;\middle|\; \overline{W}_1 \ge \frac{2}{P(P-1)},\quad \overline{W}_2 \ge \frac{2}{P(P-2)},\quad \overline{W}_1 + 2 \overline{W}_2 = \frac{6}{P(P-3)} \right\}.$$

Fig. 1 shows the collection of delay regions $\mathcal{W}(P)$ for different values of $P \in [4, 10]$. This joint region contains all feasible delay vectors under constant power allocations. Fig. 2 shows the associated augmented performance region of power-delay vectors $(P, \overline{W}_1(P), \overline{W}_2(P))$; its projection onto the delay plane is Fig. 1. After time sharing or randomization, the performance region of all frame-based power control and nonpreemptive priority policies is the convex hull of Fig. 2. The problem (27)-(28) is viewed as a stochastic optimization over such a convexified power-delay performance region.

Fig. 1. The collection of average delay regions $\mathcal{W}(P)$ for different power levels $P \in [4, 10]$.

Fig. 2. The augmented performance region of power-delay vectors $(P, \overline{W}_1(P), \overline{W}_2(P))$.
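The region $\mathcal{W}(P)$ in this example can be tabulated numerically. The following is a minimal sketch, assuming $\mu(P) = P$ and the example's arrival rates and job-size moments as default arguments; the function name and the returned dictionary layout are illustrative choices, not part of the paper.

```python
def delay_region(power, lam=(1.0, 2.0), mean_size=(1.0, 1.0),
                 second_moment=(2.0, 1.0)):
    """Summarize W(P) for a two-class nonpreemptive M/G/1 queue, mu(P) = P."""
    if power <= sum(l * s for l, s in zip(lam, mean_size)):
        raise ValueError("power too small for stability")
    # Per-class load rho_n = lambda_n * E[X_n], with E[X_n] = E[S_n] / P.
    rho = [l * s / power for l, s in zip(lam, mean_size)]
    # R = (1/2) * sum_n lambda_n * E[X_n^2], with E[X_n^2] = E[S_n^2] / P^2.
    residual = 0.5 * sum(l * s2 / power ** 2
                         for l, s2 in zip(lam, second_moment))
    # Lower bounds: each class attains R / (1 - rho_n) when it has priority.
    lower = [residual / (1.0 - r) for r in rho]
    # Conservation law: sum_n rho_n * W_n equals this constant.
    conserved = sum(rho) * residual / (1.0 - sum(rho))
    # The two vertices are the strict priority orderings.
    class1_first = (lower[0], (conserved - rho[0] * lower[0]) / rho[1])
    class2_first = ((conserved - rho[1] * lower[1]) / rho[0], lower[1])
    return {"rho": rho, "lower": lower, "conserved": conserved,
            "vertices": [class1_first, class2_first]}
```

Calling `delay_region(p)` for several `p` in `[4, 10]` gives the endpoints of the line segments that make up the collection of regions described above.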

B. Dynamic Power Control Policy 

We set up the same virtual delay queues $Z_{n,k}$ as in (8), and assume $Z_{n,0} = 0$ for all classes $n$. We represent a strict nonpreemptive priority policy by a permutation $(\pi_n)_{n=1}^N$ of $\{1, \ldots, N\}$, where $\pi_n$ denotes the job class that gets the $n$th highest priority.

Dynamic Power Control (DynPower) Policy:

1) In the $k$th frame for each $k \in \mathbb{Z}^+$, use the nonpreemptive strict priority rule $(\pi_n)_{n=1}^N$ that assigns priorities in the decreasing order of $Z_{n,k} / \mathbb{E}[S_n]$; ties are broken arbitrarily.

2) Allocate a fixed power $P_k$ in frame $k$, where $P_k$ is the solution to the following minimization of a weighted sum of power and average delays:

$$\text{minimize:} \quad V \left( \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] \right) \frac{P_k}{\mu(P_k)} + \sum_{n=1}^N Z_{\pi_n,k}\, \lambda_{\pi_n}\, \overline{W}_{\pi_n}(P_k) \tag{30}$$

$$\text{subject to:} \quad P_k \in [P_{\min}, P_{\max}]. \tag{31}$$

The value $\overline{W}_{\pi_n}(P_k)$, given later in (37), is the average delay of class $\pi_n$ under the priority rule $(\pi_n)_{n=1}^N$ and power allocation $P_k$.

3) Update queues $Z_{n,k}$ for all classes $n \in \{1, \ldots, N\}$ by (8) at every frame boundary.

The above DynPower policy requires the knowledge of arrival rates and the first two moments of job sizes for all classes $n$ (see (37)). We can remove its dependence on the second moments of job sizes, so that it only depends on the mean of arrivals and job sizes; see Appendix D for details.

C. Motivation of the DynPower Policy

We construct the Lyapunov drift argument. Define the Lyapunov function $L(Z_k) = \frac{1}{2} \sum_{n=1}^N Z_{n,k}^2$ and the one-frame Lyapunov drift $\Delta(Z_k) = \mathbb{E}\left[ L(Z_{k+1}) - L(Z_k) \mid Z_k \right]$. Similar to the derivation in Section IV-B, we have the Lyapunov drift inequality:

$$\Delta(Z_k) \le C + \sum_{n=1}^N Z_{n,k}\, \mathbb{E}\left[ \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \mid Z_k \right]. \tag{32}$$

Adding the weighted energy $V\, \mathbb{E}\left[ P_k B_k(P_k) \mid Z_k \right]$ to both sides of (32), where $V > 0$ is a control parameter, yields

$$\Delta(Z_k) + V\, \mathbb{E}\left[ P_k B_k(P_k) \mid Z_k \right] \le C + \Phi(Z_k), \tag{33}$$

where

$$\Phi(Z_k) \triangleq \mathbb{E}\left[ V P_k B_k(P_k) + \sum_{n=1}^N Z_{n,k} \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \mid Z_k \right].$$

We are interested in the frame-based policy that, in each frame $k$, allocates power and assigns priorities to minimize the ratio

$$\frac{\Phi(Z_k)}{\mathbb{E}\left[ T_k(P_k) \mid Z_k \right]}. \tag{34}$$

Note that frame size $T_k(P_k)$ depends on $Z_k$ because the power allocation that affects $T_k(P_k)$ may be $Z_k$-dependent. For any given power allocation $P_k$, $T_k(P_k)$ is independent of $Z_k$.



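Lemma 5 next shows that the minimizer of (34) is a deterministic power allocation and strict nonpreemptive priority policy. Specifically, we may regard each $p \in \mathcal{P}$ in Lemma 5 as a deterministic power allocation and strict priority policy, and the random variable $P$ as a randomized power control and priority policy.

Lemma 5. Let $P$ be a continuous random variable with state space $\mathcal{P}$. Let $G$ and $H$ be two random variables that depend on $P$ such that, for each $p \in \mathcal{P}$, $G(p)$ and $H(p)$ are well-defined random variables. Define

$$p^* \triangleq \operatorname{argmin}_{p \in \mathcal{P}} \frac{\mathbb{E}[G(p)]}{\mathbb{E}[H(p)]}, \qquad U^* \triangleq \frac{\mathbb{E}[G(p^*)]}{\mathbb{E}[H(p^*)]}.$$

Then $\frac{\mathbb{E}[G]}{\mathbb{E}[H]} \ge U^*$ regardless of the distribution of $P$.

Proof: For each $p \in \mathcal{P}$, we have $\frac{\mathbb{E}[G(p)]}{\mathbb{E}[H(p)]} \ge U^*$. Then

$$\frac{\mathbb{E}[G]}{\mathbb{E}[H]} = \frac{\mathbb{E}_P\left[ \mathbb{E}[G(p)] \right]}{\mathbb{E}_P\left[ \mathbb{E}[H(p)] \right]} \ge \frac{\mathbb{E}_P\left[ U^*\, \mathbb{E}[H(p)] \right]}{\mathbb{E}_P\left[ \mathbb{E}[H(p)] \right]} = U^*,$$

which is independent of the distribution of $P$. ■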
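Lemma 5 can be checked numerically on a small instance. The sketch below is only an illustration under simplifying assumptions: the state space is a finite list of decisions rather than a continuum, each decision is summarized by its means $\mathbb{E}[G(p)]$ and $\mathbb{E}[H(p)] > 0$, and all function names are hypothetical.

```python
import random


def best_deterministic_ratio(g_means, h_means):
    """U* = min over decisions p of E[G(p)] / E[H(p)]."""
    return min(g / h for g, h in zip(g_means, h_means))


def randomized_ratio(g_means, h_means, probs):
    """E[G] / E[H] when decision p is drawn with probability probs[p]."""
    num = sum(q * g for q, g in zip(probs, g_means))
    den = sum(q * h for q, h in zip(probs, h_means))
    return num / den


def worst_randomized_ratio(g_means, h_means, trials=1000, seed=0):
    """Smallest ratio found over randomly sampled distributions on decisions."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        weights = [rng.random() for _ in g_means]
        total = sum(weights)
        probs = [w / total for w in weights]
        smallest = min(smallest, randomized_ratio(g_means, h_means, probs))
    return smallest
```

In this setting no sampled randomization falls below the best deterministic ratio, which is what the lemma asserts.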
Under a fixed power allocation $P_k$ and a strict nonpreemptive priority rule, (34) is equal to

$$\frac{V\, \mathbb{E}[P_k B_k(P_k)] + \sum_{n=1}^N Z_{n,k} \lambda_n \left( \overline{W}_{n,k}(P_k) - d_n \right) \mathbb{E}[T_k(P_k)]}{\mathbb{E}[T_k(P_k)]} = V P_k \frac{\sum_{n=1}^N \lambda_n \mathbb{E}[S_n]}{\mu(P_k)} + \sum_{n=1}^N Z_{n,k} \lambda_n \left( \overline{W}_{n,k}(P_k) - d_n \right), \tag{35}$$

where by renewal theory

$$\frac{\mathbb{E}[B_k(P_k)]}{\mathbb{E}[T_k(P_k)]} = \sum_{n=1}^N \rho_n(P_k) = \sum_{n=1}^N \lambda_n \frac{\mathbb{E}[S_n]}{\mu(P_k)}$$

and power-dependent terms are written as functions of $P_k$. It follows that our desired policy in every frame $k$ minimizes

$$V \left( \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] \right) \frac{P_k}{\mu(P_k)} + \sum_{n=1}^N Z_{n,k} \lambda_n \overline{W}_{n,k}(P_k) \tag{36}$$

over constant power allocations $P_k \in [P_{\min}, P_{\max}]$ and nonpreemptive strict priority rules.

To further simplify, for each fixed power level $P_k$, by Lemma 2 the $c\mu$ rule that assigns priorities in the decreasing order of $Z_{n,k} / \mathbb{E}[S_n]$ minimizes the second term of (36) (note that minimizing a linear function over strict priority rules is equivalent to minimizing over all randomized priority rules, since a vertex of the performance polytope attains the minimum). This strict priority policy is optimal regardless of the value of $P_k$, and thus is overall optimal; priority assignment and power control are decoupled. We represent the optimal priority policy by $(\pi_n)_{n=1}^N$, recalling that $\pi_n$ denotes the job class that gets the $n$th highest priority. Under priorities $(\pi_n)_{n=1}^N$ and a fixed power allocation $P_k$, the average delay $\overline{W}_{\pi_n}(P_k)$ for class $\pi_n$ is equal to

$$\overline{W}_{\pi_n}(P_k) = \frac{\frac{1}{2} \sum_{j=1}^N \lambda_j \mathbb{E}[X_j^2]}{\left( 1 - \sum_{m=0}^{n-1} \rho_{\pi_m} / \mu(P_k) \right) \left( 1 - \sum_{m=0}^{n} \rho_{\pi_m} / \mu(P_k) \right)}, \tag{37}$$

where $\mathbb{E}[X_j^2] = \mathbb{E}[S_j^2] / \mu(P_k)^2$, and $\rho_{\pi_m} = \lambda_{\pi_m} \mathbb{E}[S_{\pi_m}]$ if $m \ge 1$ and $0$ if $m = 0$. The above discussions lead to the DynPower policy.
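The per-frame decision described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the power interval is replaced by a finite candidate grid `powers`, the rate function `mu` is passed in as a callable, classes are indexed from 0, and all function names are hypothetical.

```python
def priority_order(z, mean_size):
    """Classes sorted by decreasing Z_n / E[S_n] (ties keep index order)."""
    return sorted(range(len(z)), key=lambda n: z[n] / mean_size[n],
                  reverse=True)


def priority_delays(order, power, lam, mean_size, second_moment, mu):
    """Average queueing delay per class under a strict priority order, as in (37)."""
    rate = mu(power)
    # (1/2) * sum_j lambda_j * E[X_j^2], with E[X_j^2] = E[S_j^2] / mu(P)^2.
    residual = 0.5 * sum(l * s2 for l, s2 in zip(lam, second_moment)) / rate ** 2
    delays = {}
    load_above = 0.0  # load of strictly higher-priority classes
    for n in order:
        load_incl = load_above + lam[n] * mean_size[n] / rate
        if load_incl >= 1.0:
            raise ValueError("unstable at this power level")
        delays[n] = residual / ((1.0 - load_above) * (1.0 - load_incl))
        load_above = load_incl
    return delays


def dynpower_step(z, v, lam, mean_size, second_moment, powers, mu):
    """Return (priority order, power) minimizing objective (30) over a grid."""
    order = priority_order(z, mean_size)
    work_rate = sum(l * s for l, s in zip(lam, mean_size))

    def cost(p):
        w = priority_delays(order, p, lam, mean_size, second_moment, mu)
        delay_term = sum(z[n] * lam[n] * w[n] for n in order)
        return v * work_rate * p / mu(p) + delay_term

    return order, min(powers, key=cost)
```

Note that with a linear rate function such as $\mu(P) = P$ the energy term in (30) does not depend on $P_k$, so the grid search simply picks the largest power; a concave `mu` gives a nontrivial tradeoff.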

D. Performance of the DynPower Policy

Theorem 3. Let $P^*$ be the optimal average power of the problem (27)-(28). The DynPower policy achieves delay constraints $\overline{W}_n \le d_n$ for all classes $n \in \{1, \ldots, N\}$ and attains average power $\overline{P}$ satisfying

$$\overline{P} \le \frac{C \sum_{n=1}^N \lambda_n}{V} + P^*,$$

where $C > 0$ is a finite constant and $V > 0$ a predefined control parameter.

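Proof of Theorem 3: As discussed in Section VI-A, the power-delay performance region in this problem is spanned by stationary power control and nonpreemptive priority policies that use the same (possibly random) decision in every frame. Let $\pi^*$ denote one such policy that yields the optimal average power $P^*$ with feasible delays $\overline{W}_n^* \le d_n$ for all classes $n$. Let $P_k^*$ be its power allocation in frame $k$. Since policy $\pi^*$ makes i.i.d. decisions over frames, by renewal reward theory we have

$$P^* = \frac{\mathbb{E}\left[ P_k^* B(P_k^*) \right]}{\mathbb{E}\left[ T(P_k^*) \right]}.$$

Then the ratio $\frac{\Phi(Z_k)}{\mathbb{E}[T_k(P_k) \mid Z_k]}$ under policy $\pi^*$ (see the left side of (35)) is equal to

$$V \frac{\mathbb{E}\left[ P_k^* B(P_k^*) \right]}{\mathbb{E}\left[ T(P_k^*) \right]} + \sum_{n=1}^N Z_{n,k} \lambda_n \left( \overline{W}_n^* - d_n \right) \le V P^*.$$

Since the DynPower policy minimizes $\frac{\Phi(Z_k)}{\mathbb{E}[T_k(P_k) \mid Z_k]}$ over frame-based policies, including the optimal policy $\pi^*$, the ratio under the DynPower policy satisfies

$$\frac{\Phi(Z_k)}{\mathbb{E}[T_k(P_k) \mid Z_k]} \le V P^* \ \Rightarrow\ \Phi(Z_k) \le V P^*\, \mathbb{E}\left[ T_k(P_k) \mid Z_k \right].$$

Using this bound in (33) yields

$$\Delta(Z_k) + V\, \mathbb{E}\left[ P_k B_k(P_k) \mid Z_k \right] \le C + V P^*\, \mathbb{E}\left[ T_k(P_k) \mid Z_k \right].$$

Taking expectation, summing over $k \in \{0, \ldots, K-1\}$, and noting $L(Z_0) = 0$ yields

$$\mathbb{E}[L(Z_K)] + V \sum_{k=0}^{K-1} \mathbb{E}\left[ P_k B_k(P_k) \right] \le KC + V P^*\, \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]. \tag{38}$$

Since $\mathbb{E}[T_k(P_k)]$ is decreasing in $P_k$ and, under a fixed power allocation, is independent of scheduling policies, we get $\mathbb{E}[T_k(P_k)] \le \mathbb{E}[T_0(P_{\min})]$ and

$$\mathbb{E}[L(Z_K)] + V \sum_{k=0}^{K-1} \mathbb{E}\left[ P_k B_k(P_k) \right] \le K \left( C + V P^*\, \mathbb{E}[T_0(P_{\min})] \right).$$

Removing the second term and dividing by $K^2$ yields

$$\frac{\mathbb{E}[L(Z_K)]}{K^2} \le \frac{C + V P^*\, \mathbb{E}[T_0(P_{\min})]}{K}.$$

Combining it with

$$0 \le \frac{\mathbb{E}[Z_{n,K}]}{K} \le \sqrt{\frac{\mathbb{E}[Z_{n,K}^2]}{K^2}} \le \sqrt{\frac{2\, \mathbb{E}[L(Z_K)]}{K^2}}$$

and passing $K \to \infty$ proves that queue $\{Z_{n,k}\}_{k=0}^{\infty}$ is mean rate stable for all classes $n$. Thus $\overline{W}_n \le d_n$ for all $n$ by Lemma 3.

Further, removing the first term in (38) and dividing the result by $V\, \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]$ yields

$$\frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} P_k B_k(P_k) \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]} \le \frac{C}{V} \cdot \frac{K}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]} + P^* \overset{(a)}{\le} \frac{C \sum_{n=1}^N \lambda_n}{V} + P^*,$$

where (a) uses $\mathbb{E}[T_k(P_k)] \ge \mathbb{E}[I_k] = 1 / (\sum_{n=1}^N \lambda_n)$. Passing $K \to \infty$ completes the proof. ■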

VII. Optimizing Delay Penalties with Average 
Power Constraint 

The fourth problem we consider is to, over frame-based power control and nonpreemptive priority policies, minimize a separable convex function of delay vectors $(\overline{W}_n)_{n=1}^N$ subject to an average power constraint:

$$\text{minimize:} \quad \sum_{n=1}^N f_n(\overline{W}_n) \tag{39}$$

$$\text{subject to:} \quad \overline{P} \le P_{\text{const}}. \tag{40}$$

The value $\overline{P}$ is defined in (26) and $P_{\text{const}} > 0$ is a given feasible bound. The penalty functions $f_n(\cdot)$ are assumed nondecreasing, nonnegative, continuous, and convex for all classes $n$. Power allocation in every busy period takes values in $[P_{\min}, P_{\max}]$, and no power is allocated when the system is idle. In this problem, the region of feasible power-delay vectors $(\overline{P}, \overline{W}_1, \ldots, \overline{W}_N)$ is complicated because feasible delays $(\overline{W}_n)_{n=1}^N$ are indirectly decided by the power constraint (40). Using the same methodology as in the previous three problems, we construct a frame-based policy to solve (39)-(40).

We set up the virtual delay queue $\{Y_{n,k}\}_{k=0}^{\infty}$ for each class $n \in \{1, \ldots, N\}$ as in (17), in which the auxiliary variable $r_{n,k}$ takes values in $[0, R_n^{\max}]$ for some $R_n^{\max} > 0$ sufficiently large (see footnote 4). Define the discrete-time virtual power queue $\{X_k\}_{k=0}^{\infty}$ that evolves at frame boundaries $\{t_k\}_{k=0}^{\infty}$ as

$$X_{k+1} = \max\left[ X_k + P_k B_k(P_k) - P_{\text{const}} T_k(P_k),\ 0 \right]. \tag{41}$$

Assume $X_0 = 0$. The $\{X_k\}_{k=0}^{\infty}$ queue helps to achieve the power constraint $\overline{P} \le P_{\text{const}}$.

Footnote 4: For each class $n$, we need $R_n^{\max}$ to be larger than the optimal delay $\overline{W}_n^*$ in problem (39)-(40). One way is to let $R_n^{\max}$ be the maximum average delay over all classes under the minimum power allocation $P_{\min}$.

Lemma 6. If the virtual power queue $\{X_k\}_{k=0}^{\infty}$ is mean rate stable, then $\overline{P} \le P_{\text{const}}$.

Proof: Given in Appendix E. ■

A. Power-constrained Delay Fairness Policy

Power-constrained Delay Fairness (PwDelayFair) Policy: In the busy period of each frame $k \in \mathbb{Z}^+$, after observing $X_k$ and $(Y_{n,k})_{n=1}^N$:

1) Use the nonpreemptive strict priority rule $(\pi_n)_{n=1}^N$ that assigns priorities in the decreasing order of $Y_{n,k} / \mathbb{E}[S_n]$; ties are broken arbitrarily.

2) Allocate power $P_k$ for the duration of the busy period, where $P_k$ solves:

$$\text{minimize:} \quad X_k \left[ \frac{P_k}{\mu(P_k)} \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] - P_{\text{const}} \right] + \sum_{n=1}^N Y_{\pi_n,k}\, \lambda_{\pi_n}\, \overline{W}_{\pi_n}(P_k)$$

$$\text{subject to:} \quad P_k \in [P_{\min}, P_{\max}],$$

where $\overline{W}_{\pi_n}(P_k)$ is defined in (37).

3) Update $X_k$ and $Y_{n,k}$ for all classes $n$ at every frame boundary by (41) and (17), respectively. In (17), the auxiliary variable $r_{n,k}$ is the solution to

$$\text{minimize:} \quad V f_n(r_{n,k}) - Y_{n,k} \lambda_n r_{n,k}$$

$$\text{subject to:} \quad 0 \le r_{n,k} \le R_n^{\max}.$$

B. Motivation of the PwDelayFair Policy

The construction of the Lyapunov drift argument follows closely those in the previous problems; details are omitted for brevity. Define the vector $\chi_k = [X_k; Y_{1,k}, \ldots, Y_{N,k}]$, the Lyapunov function $L(\chi_k) = \frac{1}{2} \left( X_k^2 + \sum_{n=1}^N Y_{n,k}^2 \right)$, and the one-frame Lyapunov drift $\Delta(\chi_k) = \mathbb{E}\left[ L(\chi_{k+1}) - L(\chi_k) \mid \chi_k \right]$. We can show there exists a finite constant $C > 0$ such that

$$\Delta(\chi_k) \le C + X_k\, \mathbb{E}\left[ P_k B_k(P_k) - P_{\text{const}} T_k(P_k) \mid \chi_k \right] + \sum_{n=1}^N Y_{n,k}\, \mathbb{E}\left[ \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - r_{n,k} \right) \mid \chi_k \right]. \tag{42}$$

Adding the term $V \sum_{n=1}^N \mathbb{E}\left[ f_n(r_{n,k}) T_k(P_k) \mid \chi_k \right]$ to both sides of (42), where $V > 0$ is a control parameter, and evaluating the result under a frame-based policy yields

$$\Delta(\chi_k) + V \sum_{n=1}^N \mathbb{E}\left[ f_n(r_{n,k}) T_k(P_k) \mid \chi_k \right] \le C + \Psi(\chi_k), \tag{43}$$

where

$$\Psi(\chi_k) \triangleq \mathbb{E}\left[ T_k(P_k) \mid \chi_k \right] \sum_{n=1}^N Y_{n,k} \lambda_n \overline{W}_{n,k}(P_k) + X_k\, \mathbb{E}\left[ P_k B_k(P_k) \mid \chi_k \right] - X_k P_{\text{const}}\, \mathbb{E}\left[ T_k(P_k) \mid \chi_k \right] + \mathbb{E}\left[ T_k(P_k) \mid \chi_k \right] \sum_{n=1}^N \mathbb{E}\left[ V f_n(r_{n,k}) - Y_{n,k} \lambda_n r_{n,k} \mid \chi_k \right],$$

where $\overline{W}_{n,k}(P_k)$ is the average delay of class $n$ if the control and power allocation in frame $k$ is repeated in every frame. We are interested in the frame-based policy that minimizes the ratio $\frac{\Psi(\chi_k)}{\mathbb{E}[T_k(P_k) \mid \chi_k]}$ in each frame $k \in \mathbb{Z}^+$. Lemma 5 shows the minimizer is a deterministic policy, under which the ratio is equal to

$$\sum_{n=1}^N Y_{n,k} \lambda_n \overline{W}_{n,k}(P_k) + X_k \left( P_k\, \rho_{\text{sum}}(P_k) - P_{\text{const}} \right) + \sum_{n=1}^N \left( V f_n(r_{n,k}) - Y_{n,k} \lambda_n r_{n,k} \right),$$

where $\rho_{\text{sum}}(P_k) \triangleq \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] / \mu(P_k)$. Under similar simplifications as the DynPower policy in Section VI-B, we can show that the PwDelayFair policy is the desired policy.
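For a general convex penalty, the auxiliary-variable subproblem in step 3 of the policy has no closed form, but it is a one-dimensional convex minimization. The following is a minimal sketch using ternary search; the function name, the tolerance, and the choice of ternary search are assumptions made for illustration and are not prescribed by the paper.

```python
def solve_auxiliary(f, y, lam, v, r_max, tol=1e-9):
    """Minimize v*f(r) - y*lam*r over 0 <= r <= r_max for a convex f.

    f     : convex, nondecreasing penalty function of one variable
    y     : virtual queue backlog Y_{n,k}
    lam   : arrival rate of the class
    v     : control parameter V
    r_max : upper limit R_n^max on the auxiliary variable
    """
    def objective(r):
        return v * f(r) - y * lam * r

    lo, hi = 0.0, r_max
    # Ternary search: a convex objective is unimodal, so one third of the
    # interval that cannot contain a minimizer is discarded per iteration.
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

With the quadratic penalty $f(r) = \frac{1}{2} c r^2$ the result agrees with the clipped value $Y_{n,k} \lambda_n / (V c)$ up to the search tolerance.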

C. Performance of the PwDelayFair Policy 

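Theorem 4. For any feasible average power constraint $\overline{P} \le P_{\text{const}}$, the PwDelayFair policy satisfies $\overline{P} \le P_{\text{const}}$ and yields average delay penalty satisfying

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} |A_{n,k}| \right]} \right) \le \frac{C \sum_{n=1}^N \lambda_n}{V} + \sum_{n=1}^N f_n(\overline{W}_n^*), \tag{44}$$

where $V > 0$ is a predefined control parameter.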

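Proof of Theorem 4: Let $\pi^*$ be the frame-based randomized policy that solves (39)-(40). Let $(\overline{W}_n^*)_{n=1}^N$ be the optimal average delay vector, and $\overline{P}^*$, where $\overline{P}^* \le P_{\text{const}}$, be the associated power consumption. In frame $k \in \mathbb{Z}^+$, the ratio $\frac{\Psi(\chi_k)}{\mathbb{E}[T_k(P_k) \mid \chi_k]}$ evaluated under policy $\pi^*$ and genie decisions $r_{n,k}^* = \overline{W}_n^*$ for all classes $n$ is equal to

$$\sum_{n=1}^N Y_{n,k} \lambda_n \overline{W}_n^* + X_k \left( \overline{P}^* - P_{\text{const}} \right) + \sum_{n=1}^N \left( V f_n(\overline{W}_n^*) - Y_{n,k} \lambda_n \overline{W}_n^* \right) \le V \sum_{n=1}^N f_n(\overline{W}_n^*). \tag{45}$$

Since the PwDelayFair policy minimizes $\frac{\Psi(\chi_k)}{\mathbb{E}[T_k(P_k) \mid \chi_k]}$ in every frame $k$, the ratio under the PwDelayFair policy satisfies

$$\frac{\Psi(\chi_k)}{\mathbb{E}[T_k(P_k) \mid \chi_k]} \le V \sum_{n=1}^N f_n(\overline{W}_n^*).$$

Then (43) under the PwDelayFair policy satisfies

$$\Delta(\chi_k) + V\, \mathbb{E}\left[ \sum_{n=1}^N f_n(r_{n,k}) T_k(P_k) \mid \chi_k \right] \le C + V\, \mathbb{E}\left[ T_k(P_k) \mid \chi_k \right] \sum_{n=1}^N f_n(\overline{W}_n^*). \tag{46}$$

Removing the second term in (46) and taking expectation, we get

$$\mathbb{E}[L(\chi_{k+1})] - \mathbb{E}[L(\chi_k)] \le C + V\, \mathbb{E}[T_k(P_k)] \sum_{n=1}^N f_n(\overline{W}_n^*).$$

Summing over $k \in \{0, \ldots, K-1\}$, and using $L(\chi_0) = 0$ yields

$$\mathbb{E}[L(\chi_K)] \le KC + V\, \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right] \sum_{n=1}^N f_n(\overline{W}_n^*) \le K C_1, \tag{47}$$

where $C_1 \triangleq C + V\, \mathbb{E}[T_0(P_{\min})] \sum_{n=1}^N f_n(\overline{W}_n^*)$, and we have used $\mathbb{E}[T_k(P_k)] \le \mathbb{E}[T_0(P_{\min})]$. Inequality (47) suffices to conclude that queues $X_k$ and $Y_{n,k}$ for all classes $n$ are all mean rate stable. From Lemma 6, the constraint $\overline{P} \le P_{\text{const}}$ is achieved. The proof of (44) follows that of Theorem 2. ■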

VIII. Simulations 

Here we simulate the DelayFeas and DelayFair policies in the first two delay control problems; simulations for the DynPower and PwDelayFair policies in the last two delay-power control problems are left as future work. The setup is as follows. Consider a two-class M/M/1 queue with Poisson arrival rates $(\lambda_1, \lambda_2) = (1, 2)$, loading factors $(\rho_1, \rho_2) = (0.4, 0.4)$, and mean exponential service times $\mathbb{E}[X_1] = \rho_1 / \lambda_1 = 0.4$ and $\mathbb{E}[X_2] = \rho_2 / \lambda_2 = 0.2$ (we use service times directly since there is no power control). The average delay region $\mathcal{W}$ of this two-class M/M/1 queue, given in (29), is

$$\mathcal{W} = \left\{ (\overline{W}_1, \overline{W}_2) \;\middle|\; \overline{W}_1 + \overline{W}_2 = 2.4,\quad \overline{W}_1 \ge 0.4,\ \overline{W}_2 \ge 0.4 \right\}. \tag{48}$$

For the DelayFeas policy, we consider five sets of delay constraints $(d_1, d_2) = (0.45, 2.05)$, $(0.85, 1.65)$, $(1.25, 1.25)$, $(1.65, 0.85)$, and $(2.05, 0.45)$; they are all $(0.05, 0.05)$ away from a feasible point on $\mathcal{W}$. For each constraint set $(d_1, d_2)$, we repeat the simulation 10 times and average the resulting average delays, where each simulation is run for $10^6$ frames. The results are given in Fig. 3, which shows that the DelayFeas policy adaptively yields feasible average delays in response to different constraints.

Fig. 3. The performance of the DelayFeas policy under different delay constraints $(d_1, d_2)$. (Horizontal axis: class 1 average delay. Legend: delay region $\mathcal{W}$, delay bounds $(d_1, d_2)$, simulation results.)

Next, for the DelayFair policy, we consider the following delay proportional fairness problem:

$$\text{minimize:} \quad \frac{1}{2} \overline{W}_1^2 + 2 \overline{W}_2^2 \tag{49}$$

$$\text{subject to:} \quad (\overline{W}_1, \overline{W}_2) \in \mathcal{W} \tag{50}$$

$$\overline{W}_1 \le 2,\ \overline{W}_2 \le 2 \tag{51}$$

where the delay region $\mathcal{W}$ is given in (48). The additional delay constraints (51) are chosen to be non-restrictive for the ease of demonstration. The optimal solution to (49)-(51) is $(\overline{W}_1^*, \overline{W}_2^*) = (1.92, 0.48)$; the optimal delay penalty is $\frac{1}{2} (\overline{W}_1^*)^2 + 2 (\overline{W}_2^*)^2 = 2.304$. We simulate the DelayFair policy for different values of control parameter $V \in \{10^2, 10^3, 5 \times 10^3, 10^4\}$. The results are in Table I. Every entry in Table I is the average over 10 simulation runs, where each simulation is run for $10^6$ frames. As $V$ increases, the DelayFair policy yields average delays approaching the optimal $(1.92, 0.48)$ and the optimal penalty 2.304.

TABLE I
The average delays and delay penalty under the DelayFair policy for different values of control parameter V.

V                W1 (DelayFair)   W2 (DelayFair)   Delay penalty
100              1.611            0.785            2.529
1000             1.809            0.591            2.335
5000             1.879            0.523            2.312
10000            1.894            0.503            2.301
Optimal value:   1.92             0.48             2.304
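The stated optimum of (49)-(51) can be reproduced with a few lines of arithmetic. The sketch below assumes the penalty has the weighted quadratic form $\frac{1}{2} c_1 W_1^2 + \frac{1}{2} c_2 W_2^2$ with $(c_1, c_2) = (1, 4)$, which matches (49), and that the delay region is the line segment in (48); the function and parameter names are hypothetical.

```python
def fairness_optimum(total=2.4, c1=1.0, c2=4.0, w_min=0.4, w_max=2.0):
    """Minimize 0.5*c1*w1**2 + 0.5*c2*w2**2 subject to w1 + w2 = total.

    Both delays must also lie in [w_min, w_max].
    """
    # Stationarity on the line w1 + w2 = total gives c1*w1 = c2*w2.
    w1 = total * c2 / (c1 + c2)
    # Project onto the feasible interval for w1; for a one-dimensional
    # convex objective the projected point is the constrained minimizer.
    low = max(w_min, total - w_max)
    high = min(w_max, total - w_min)
    w1 = min(max(w1, low), high)
    w2 = total - w1
    penalty = 0.5 * c1 * w1 ** 2 + 0.5 * c2 * w2 ** 2
    return w1, w2, penalty
```

With the default arguments this returns delays of about 1.92 and 0.48 and a penalty of about 2.304, in agreement with the values quoted in the text.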

IX. Conclusions 

This paper solves constrained delay-power stochastic op- 
timization problems in a nonpreemptive multi-class M/G/l 
queue from a new mathematical programming perspective. 
After characterizing the performance region by the collec- 
tion of all frame-based randomizations of base policies that 
comprise deterministic power control and nonpreemptive strict 
priority policies, we use the Lyapunov optimization theory to 
construct dynamic control algorithms that yield near-optimal
performance. These policies greedily select and run a base 
policy in every frame by minimizing a ratio of an expected 
"drift plus penalty" sum over the expected frame size, and 
require limited statistical knowledge of the system. Time 
average constraints are turned into virtual queues that need 
to be stabilized. 

While this paper studies delay and power control in a 
nonpreemptive multi-class M/G/l queue, our framework 
shall have a much wider applicability to other stochastic op- 
timization problems over queueing networks, especially those 
that satisfy strong (possibly generalized [36]) conservation 
laws and have polymatroidal performance regions. Different
performance metrics such as throughput (together with admission
control), delay, power, and functions of them can be
mixed together to serve as objective functions or time average 
constraints. It is of interest to us to explore all these directions. 

Another connection is that, in [15], we have used the frame-based
Lyapunov optimization theory to optimize a general
functional objective over an inner bound on the performance 
region of a restless bandit problem with Markov ON/OFF 
bandits. This inner bound approach can be viewed as an ap- 
proximation to such complex restless bandit problems. Multi- 
class queueing systems and restless bandit problems are two 
prominent examples of general stochastic control problems. 
Thus it would be interesting to develop the Lyapunov opti- 
mization theory as a unified framework to attack other open 
stochastic control problems.

Appendix A

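Proof of Lemma 1: We index all $N!$ nonpreemptive strict priority policies by $\{\pi_j\}_{j \in \mathcal{J}}$, where $\mathcal{J} = \{1, \ldots, N!\}$ is an index set and $\pi_j$ denotes the $j$th priority ordering. Consider a randomized policy $\pi_{\text{rand}}$ defined by the probability distribution $\{\alpha_j\}_{j \in \mathcal{J}}$, where $\pi_{\text{rand}}$ uses priority ordering $\pi_j$ for the duration of a frame with probability $\alpha_j$ in every frame. Let $W_n^{\text{sum}}(\pi_j)$ denote the sum of queueing delays in class $n$ during a frame in which policy $\pi_j$ is used. Likewise, define $W_n^{\text{sum}}(\pi_{\text{rand}})$ under policy $\pi_{\text{rand}}$. By conditional expectation, we get

$$\mathbb{E}\left[ W_n^{\text{sum}}(\pi_{\text{rand}}) \right] = \sum_{j \in \mathcal{J}} \alpha_j\, \mathbb{E}\left[ W_n^{\text{sum}}(\pi_j) \right]. \tag{52}$$

Next, define $\overline{W}_n(\pi_j)$ as the average queueing delay for class $n$ if policy $\pi_j$ is used in every frame. Define $\overline{W}_n(\pi_{\text{rand}})$ under policy $\pi_{\text{rand}}$ similarly. From renewal reward theory, we have

$$\mathbb{E}\left[ W_n^{\text{sum}}(\pi_{\text{rand}}) \right] = \lambda_n \overline{W}_n(\pi_{\text{rand}})\, \mathbb{E}[T] \tag{53}$$

$$\mathbb{E}\left[ W_n^{\text{sum}}(\pi_j) \right] = \lambda_n \overline{W}_n(\pi_j)\, \mathbb{E}[T], \tag{54}$$

where $\mathbb{E}[T]$ is the average frame size. Note that $\mathbb{E}[T]$ is independent of scheduling policies. From (52)-(54) we get

$$\overline{W}_n(\pi_{\text{rand}}) = \sum_{j \in \mathcal{J}} \alpha_j \overline{W}_n(\pi_j). \tag{55}$$

Define $x_n(\pi_j) \triangleq \rho_n \overline{W}_n(\pi_j)$ for all priority orderings $\pi_j$. Define $x_n(\pi_{\text{rand}})$ similarly. Multiplying (55) by $\rho_n$ for all classes $n$ and noting that vertices of the polytope $\Omega$ are performance vectors of strict priority policies, we have

$$\left( x_n(\pi_{\text{rand}}) \right)_{n=1}^N = \sum_{j \in \mathcal{J}} \alpha_j \left( x_n(\pi_j) \right)_{n=1}^N \in \Omega,$$

which proves the first part.

Conversely, for any given vector $(\overline{W}_n)_{n=1}^N$ in the delay region $\mathcal{W}$, there exists a probability distribution $\{\beta_j\}_{j \in \mathcal{J}}$ such that

$$\rho_n \overline{W}_n = \sum_{j \in \mathcal{J}} \beta_j x_n(\pi_j) \ \Rightarrow\ \overline{W}_n = \sum_{j \in \mathcal{J}} \beta_j \overline{W}_n(\pi_j)$$

for all classes $n$. From (55), the randomized $\pi_{\text{rand}}$ policy defined by the probability distribution $\{\beta_j\}_{j \in \mathcal{J}}$ achieves the desired average delays $(\overline{W}_n)_{n=1}^N$. ■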
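Relation (55) says that a frame-based randomization over strict priority orderings produces the corresponding convex combination of their delay vectors. The sketch below illustrates this on the two-class M/M/1 setup of Section VIII, whose two vertices are read off from (48); the function name is hypothetical.

```python
def mix_priority_delays(vertex_delays, probs):
    """Average delays of a randomized policy, as a convex combination (55).

    vertex_delays : list of delay vectors, one per strict priority ordering
    probs         : probability of using each ordering in a frame
    """
    if min(probs) < 0 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be a probability distribution")
    n_classes = len(vertex_delays[0])
    return [sum(q * v[i] for q, v in zip(probs, vertex_delays))
            for i in range(n_classes)]
```

For example, with vertices `(0.4, 2.0)` (class 1 has priority) and `(2.0, 0.4)` (class 2 has priority), giving class 1 priority in 5% of the frames yields average delays of about `(1.92, 0.48)`, which is the optimal point of the fairness problem in Section VIII.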

Appendix B 

Lemma 7. In a multi-class M/G/1 queue with $N$ classes and a constant service rate (assuming a constant power allocation and no power control), if the first four moments of service times $X_n$ are finite for all classes $n \in \{1, \ldots, N\}$, and the system is stable with $\sum_{n=1}^N \lambda_n \mathbb{E}[X_n] < 1$, then, in every frame $k \in \mathbb{Z}^+$, the expectation

$$\mathbb{E}\left[ \left( \sum_{i \in A_{n,k}} \left( W_{n,k}^{(i)} - d_n \right) \right)^2 \right]$$

is finite for all classes $n$ under any work-conserving policy.

Proof of Lemma 7: For brevity, we only give a sketch of proof. Using $\mathbb{E}[(a - b)^2] \le 2\, \mathbb{E}[a^2 + b^2]$, it suffices to show

$$\mathbb{E}\left[ \left( \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right)^2 \right] \quad \text{and} \quad \mathbb{E}\left[ |A_{n,k}|^2 \right]$$

are both finite. We only show the first expectation is finite; the finiteness of the second expectation follows that of the first expectation. Define $N_k$ as the number of arrivals of all classes served in frame $k$; we have $|A_{n,k}| \le N_k$ for all $k$ and classes $n$. In the $k$th frame, since the queueing delay $W_{n,k}^{(i)}$ of each job $i \in A_{n,k}$ is bounded by the busy period $B_k$, we have

$$\mathbb{E}\left[ \left( \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right)^2 \right] \le \mathbb{E}\left[ B_k^2 N_k^2 \right].$$

Note that $B_k$ and $N_k$ are dependent because a large busy period serves more jobs. By the Cauchy-Schwarz inequality we have

$$\mathbb{E}\left[ B_k^2 N_k^2 \right] \le \sqrt{\mathbb{E}[B_k^4]\, \mathbb{E}[N_k^4]}.$$

It suffices to show that both $\mathbb{E}[B_k^4]$ and $\mathbb{E}[N_k^4]$ are finite.

First we argue $\mathbb{E}[B_k^4] < \infty$. In the following we drop the index $k$ for notational convenience. Since the busy period $B$ is the same under any work-conserving policy, we consider LIFO scheduling with preemptive priority. In this scheme, let $a_0$ denote the arrival that starts the current busy period. Arrival $a_0$ can be of any class, and the duration it stays in the system is equal to the busy period $B$. Next, let $\{a_1, \ldots, a_M\}$ denote the $M$ jobs that arrive during the service of job $a_0$. Let $B(1), \ldots, B(M)$ denote the durations they stay in the system. Under LIFO with preemptive priority, we observe that $B(1), \ldots, B(M)$ are independent and identically distributed with the same distribution as the starting busy period $B$ (since any new arrival never sees any previous arrivals, and starts a new busy period). Consequently, we have

$$B = X + \sum_{m=1}^M B(m), \tag{56}$$

where $X$ denotes the service time of $a_0$. Note also that each duration $B(m)$ for all $m \in \{1, \ldots, M\}$ is independent of $M$. By taking square and expectation of (56), we can compute $\mathbb{E}[B^2]$ in closed form and show that it is finite if the first two moments of $X_n$ for all $n$ are finite. Likewise, by raising (56) to the third and fourth power and taking expectation, we can compute $\mathbb{E}[B^3]$ and $\mathbb{E}[B^4]$ and show they are finite if the first four moments of $X_n$ are finite (showing $\mathbb{E}[B^4] < \infty$ requires the finiteness of the first three moments of $B$).

Likewise, to show $\mathbb{E}[N^4]$ is finite, under LIFO with preemptive priority we observe

$$N = 1 + \sum_{m=1}^M N(m), \tag{57}$$

where $N(m)$ denotes the number of arrivals, including $a_m$, served during the course of arrival $a_m$ staying in the system; $N(m)$ are i.i.d. and independent of $M$. By raising (57) to the second, third, and fourth power and taking expectation, we can compute $\mathbb{E}[N^4]$ in closed form and show it is finite. ■

Appendix C

Proof of Lemma 4: From (17) we get

$$Y_{n,k+1} \ge Y_{n,k} - r_{n,k} |A_{n,k}| + \sum_{i \in A_{n,k}} W_{n,k}^{(i)}.$$

Summing over $k \in \{0, \ldots, K-1\}$ and using $Y_{n,0} = 0$ yields

$$\sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} - Y_{n,K} \le \sum_{k=0}^{K-1} r_{n,k} |A_{n,k}|.$$

Taking expectation and dividing by $\lambda_n \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]$ yields

$$\frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right]}{\lambda_n \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} - \frac{\mathbb{E}[Y_{n,K}]}{\lambda_n K\, \mathbb{E}[T_0]} \le \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} |A_{n,k}| \right]}{\lambda_n \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]}, \tag{58}$$

where in the second term we use $\mathbb{E}[T_k] = \mathbb{E}[T_0]$ for all $k$. In the last term of (58), since the value $r_{n,k}$ is independent of $|A_{n,k}|$ and $T_k$, we get

$$\frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} |A_{n,k}| \right]}{\lambda_n \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} = \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]}. \tag{59}$$

Defining $\theta_K^{(n)}$ as the left side of (58) and using (59) yields

$$\theta_K^{(n)} \le \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]}. \tag{60}$$

Since $f_n(\cdot)$ is nondecreasing for all classes $n$, we get

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \theta_K^{(n)} \right) \le \limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} r_{n,k} T_k \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k \right]} \right). \tag{61}$$

Define the value

$$\eta_K^{(n)} \triangleq \frac{\mathbb{E}\left[ \sum_{k=0}^{K-1} \sum_{i \in A_{n,k}} W_{n,k}^{(i)} \right]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} |A_{n,k}| \right]} = \theta_K^{(n)} + \frac{\mathbb{E}[Y_{n,K}]}{\lambda_n K\, \mathbb{E}[T_0]}. \tag{62}$$

To complete the proof, from (61) it suffices to show

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \eta_K^{(n)} \right) = \limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \theta_K^{(n)} \right). \tag{63}$$

Let the left side of (63) attain its lim sup in the subsequence $\{K_m\}_{m=1}^{\infty}$. It follows that

$$\limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \eta_K^{(n)} \right) = \lim_{m \to \infty} \sum_{n=1}^N f_n\left( \eta_{K_m}^{(n)} \right) \overset{(a)}{=} \sum_{n=1}^N f_n\left( \lim_{m \to \infty} \eta_{K_m}^{(n)} \right) \overset{(b)}{=} \sum_{n=1}^N f_n\left( \lim_{m \to \infty} \theta_{K_m}^{(n)} \right) \le \limsup_{K \to \infty} \sum_{n=1}^N f_n\left( \theta_K^{(n)} \right),$$

where (a) follows from the continuity of $f_n(\cdot)$ for all classes $n$, and (b) follows from (62) and mean rate stability of $Y_{n,k}$. The other direction can be proved similarly. ■



Appendix D 

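We show how to remove the dependence on the second moments of job sizes $S_n$ in the DynPower policy in Section VI-B. Using (37), we rewrite (30) as

$$\hat{R} \left[ \frac{V}{\hat{R}} \left( \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] \right) \frac{P_k}{\mu(P_k)} + \sum_{n=1}^N \frac{Z_{\pi_n,k}\, \lambda_{\pi_n}}{\left( \mu(P_k) - \sum_{m=0}^{n-1} \rho_{\pi_m} \right) \left( \mu(P_k) - \sum_{m=0}^{n} \rho_{\pi_m} \right)} \right], \tag{64}$$

where

$$\hat{R} \triangleq \frac{1}{2} \sum_{n=1}^N \lambda_n \mathbb{E}[S_n^2], \qquad \rho_{\pi_m} = \begin{cases} \lambda_{\pi_m} \mathbb{E}[S_{\pi_m}], & 1 \le m \le N \\ 0, & \text{otherwise.} \end{cases}$$

By ignoring the constant $\hat{R}$ and redefining $\tilde{V} = V / \hat{R}$ in (64), it is equivalent in the $k$th frame of the DynPower policy to allocate power $P_k \in [P_{\min}, P_{\max}]$ that minimizes

$$\tilde{V} \left( \sum_{n=1}^N \lambda_n \mathbb{E}[S_n] \right) \frac{P_k}{\mu(P_k)} + \sum_{n=1}^N \frac{Z_{\pi_n,k}\, \lambda_{\pi_n}}{\left( \mu(P_k) - \sum_{m=0}^{n-1} \rho_{\pi_m} \right) \left( \mu(P_k) - \sum_{m=0}^{n} \rho_{\pi_m} \right)}. \tag{65}$$

The sum (65) does not depend on second moments of job sizes. From Theorem 3 and using $V = \tilde{V} \hat{R}$, this alternative policy yields average power $\overline{P}$ satisfying

$$\overline{P} \le \frac{C \sum_{n=1}^N \lambda_n}{\tilde{V} \hat{R}} + P^*,$$

and we preserve the property that the resulting average power $\overline{P}$ is $O(1/\tilde{V})$ away from the optimal $P^*$.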

Appendix E 
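Proof of Lemma 6: From (41) we have

$$X_{k+1} \ge X_k + P_k B_k(P_k) - P_{\text{const}} T_k(P_k).$$

Summing over $k \in \{0, \ldots, K-1\}$, taking expectation, and using $X_0 = 0$ yields

$$\mathbb{E}[X_K] \ge \mathbb{E}\left[ \sum_{k=0}^{K-1} P_k B_k(P_k) \right] - P_{\text{const}}\, \mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right].$$

Dividing by $\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]$ and passing $K \to \infty$ yields

$$\overline{P} \le P_{\text{const}} + \limsup_{K \to \infty} \frac{\mathbb{E}[X_K]}{\mathbb{E}\left[ \sum_{k=0}^{K-1} T_k(P_k) \right]} \overset{(a)}{\le} P_{\text{const}} + \limsup_{K \to \infty} \frac{\mathbb{E}[X_K]}{K} \sum_{n=1}^N \lambda_n,$$

where (a) uses $\mathbb{E}[T_k(P_k)] \ge \mathbb{E}[I_k] = 1 / (\sum_{n=1}^N \lambda_n)$. Then the result follows by mean rate stability of queue $\{X_k\}_{k=0}^{\infty}$. ■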
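The sample-path inequality behind this proof can be seen on a short trace of frames. The sketch below runs the virtual power queue update (41) over a list of frames; the function name and the tuple layout `(power, busy_period, frame_length)` are assumptions made for illustration.

```python
def run_power_queue(frames, p_const):
    """Run the virtual power queue (41) over a sequence of frames.

    frames  : iterable of (power, busy_period, frame_length) tuples
    p_const : average power bound P_const
    Returns the final backlog X_K, the total energy, and the total time.
    """
    backlog = 0.0  # X_0 = 0
    energy = 0.0
    elapsed = 0.0
    for power, busy, length in frames:
        # Energy used in the frame arrives; p_const * frame length is served.
        backlog = max(backlog + power * busy - p_const * length, 0.0)
        energy += power * busy
        elapsed += length
    return backlog, energy, elapsed
```

On any trace, the empirical average power `energy / elapsed` is at most `p_const + backlog / elapsed`, so a backlog that grows sublinearly in the number of frames forces the power constraint to hold in the limit.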



